Isolated nucleic acid molecules which encode T cell inducible factors (TIFs), the proteins encoded, and uses thereof

ABSTRACT

The invention involves isolation of nucleic acid molecules, the expression of which is upregulated by interleukin-9. The amino acid sequences of the proteins which correspond to the nucleic acid molecules show some structural features of cytokines. In addition to the nucleic acid molecules and the proteins, various uses of the molecules are disclosed. The molecules are referred to as T cell inducible factors.

RELATED APPLICATIONS

[0001] This application is a continuation in part of Ser. No. 09/354,243, filed on Jul. 16, 1999, which in turn is a continuation in part of Ser. No. 09/178,973, filed Oct. 26, 1998. Both of these applications are incorporated by reference in their entirety.

FIELD OF THE INVENTION

[0002] This invention relates to newly isolated nucleic acid molecules and their uses. The nucleic acid molecules are shown to be upregulated by the cytokine interleukin-9 (“IL-9”). Also disclosed are the proteins encoded thereby. They are described as T Cell Derived Inducible Factors (“TIFs”). These nucleic acid molecules encode proteins which induce STAT activation in cells. They can be used, for example, in the stimulation of regeneration of targeted tissues. Further, their inhibitors or antagonists can be used to retard, prevent or inhibit differentiation of other tissues.

BACKGROUND AND PRIOR ART

[0003] The last decade has seen knowledge of the immune system and its regulation expand tremendously. One area of particular interest has been that of research on the proteins and glycoproteins which regulate the immune system. One of the best known families of these molecules is the cytokines. These are molecules which are involved in the “communication” of cells with each other. The individual members of the cytokine family have been found to be involved in a wide variety of pathological conditions, such as cancer and allergies. While the cytokines are sometimes involved in the pathology of a condition, they are also known to be therapeutically useful.

[0004] Interleukins are one type of cytokine. The literature on interleukins is vast. An exemplary, but by no means exhaustive listing of the patents in this area includes U.S. Pat. No. 4,778,879 to Mertelsmann et al.; U.S. Pat. No. 4,490,289 to Stern; U.S. Pat. No. 4,518,584 to Mark et al.; and U.S. Pat. No. 4,851,512 to Miyaji et al., all of which involve interleukin-2 or “IL-2.” Additional patents have issued which relate to interleukin-1 (“IL-1”), such as U.S. Pat. No. 4,808,611 to Cosman. The disclosures of all of these patents are incorporated by reference herein. More recent patents on different interleukins include U.S. Pat. Nos. 5,694,234 (IL-13); 5,650,492 (IL-12); 5,700,664, 5,371,193 and 5,215,895 (IL-11); 5,728,377, 5,710,251, 5,328,989 (IL-10); 5,580,753, 5,587,302, 5,157,112, 5,208,218 (IL-9); 5,194,375, 4,965,195 (IL-7); 5,723,120, 5,178,856 (IL-6), and 5,017,691 (IL-4). Even a cursory review of this patent literature shows the diversity of the properties of the members of the interleukin family. One can assume that the larger cytokine family shows even more diversity. See, e.g., Aggarwal et al., ed., Human Cytokines: Handbook For Basic And Clinical Research (Blackwell Scientific Publications, 1992), Paul, ed., Fundamental Immunology (Raven Press, 1993), pp. 763-836, “T-Cell Derived Cytokines And Their Receptors”, and “Proinflammatory Cytokines and Immunity.” All cited references are incorporated by reference.

[0005] The relationships between various cytokines are complex. As will be seen from the references cited herein, as the level of a particular cytokine increases or decreases, this can affect the levels of other molecules produced by a subject, either directly or indirectly. Among the affected molecules are other cytokines.

[0006] The lymphokine IL-9, previously referred to as “P40,” is a T-cell derived molecule which was originally identified as a factor which sustained permanent antigen independent growth of T4 cell lines. See, e.g., Uyttenhove et al., Proc. Natl. Acad. Sci. 85: 6934 (1988), and Van Snick et al., J. Exp. Med. 169: 363 (1989), the disclosures of which are incorporated by reference, as is that of Simpson et al., Eur. J. Biochem. 183: 715 (1989).

[0007] The activity of IL-9 was at first observed on restricted T4 cell lines, failing to show activity on CTLs or freshly isolated T cells. See, e.g., Uyttenhove et al., supra, and Schmitt et al., Eur. J. Immunol. 19: 2167 (1989). This range of activity was expanded when experiments showed that IL-9 and the molecule referred to as T Cell Growth Factor III (“TCGF III”) are identical to MEA (Mast Cell Growth Enhancing Activity), a factor which potentiates the proliferative response of bone marrow derived mast cells to IL-3, as is described by Hültner et al., Eur. J. Immunol. and in U.S. patent application Ser. No. 498,182 filed Mar. 23, 1990, the disclosures of both being incorporated by reference herein. It was also found that the human form of IL-9 stimulates proliferation of megakaryoblastic leukemia. See Yang et al., Blood 74: 1880 (1989). Recent work on IL-9 has shown that it also supports erythroid colony formation (Donahue et al., Blood 75(12): 2271-2275 (6-15-90)); promotes the proliferation of myeloid erythroid burst formation (Williams et al., Blood 76: 306-311 (9-1-90)); and supports clonal maturation of BFU-E's of adult and fetal origin (Holbrook et al., Blood 77(10): 2129-2134 (5-15-91)). Expression of IL-9 has also been implicated in Hodgkin's disease and large cell anaplastic lymphoma (Merz et al., Blood 78(8): 1311-1317 (9-1-90)). Genetic analyses of mice that were susceptible or resistant to the development of bronchial hyperresponsiveness have unraveled a linkage with the IL-9 gene as well as a correlation between IL-9 production and susceptibility in this model (Nicolaides et al., Proc. Natl. Acad. Sci. USA, 94, 13175-13180, 1997). Human genetic studies also point to the IL-9 and IL-9R genes as candidates for asthma (Doull et al., Am. J. Respir. Crit. Care Med., 153, 1280-1284, 1996; Holroyd et al., Genomics 52 233-235, 1998). Secondly, IL-9 transgenic mice allowed for the demonstration that increased IL-9 expression results in lung mastocytosis, hypereosinophilia, bronchial hyperresponsiveness and high levels of IgE (Temann et al., J. Exp. Med. 188, 1307-1320, 1998; Godfraind et al., J. Immunol. 160, 3989-3996, 1998; McLane et al., Am. J. Resp. Cell. Mol. 19:713-720 (1999)). Taken together, these observations strongly suggest that IL-9 plays a major role in this disease. Additional work has implicated IL-9 and muteins of this cytokine in asthma and allergies. See, e.g., PCT Application US96/12757 (Levitt et al.), and PCT US97/21992 (Levitt et al.), both of which are incorporated by reference.

[0008] IL-9 is known to affect the levels of other molecules in subjects. See Louahed et al., J. Immunol. 154: 5061-5070 (1995); Demoulin et al., Mol. Cell. Biol. 16: 4710-4716 (1996), both incorporated by reference. It will be recognized that the molecules affected have their own functions in biological systems. For example, Demoulin et al. show that many of the known activities of IL-9 are mediated by activation of STAT transcription factors. As such, there is continued interest in trying to identify molecules whose presence and/or level is affected by other molecules, such as cytokines.

[0009] The disclosure which follows describes such molecules. It was found that nucleic acid molecules encoding the proteins of the invention were expressed in the presence of IL-9, but not in its absence. Hence, these molecules are, inter alia, “markers” for the expression or effect of IL-9 in a subject. The molecules are referred to as T Cell Derived Inducible Factors or “TIFs” hereafter. These and other features of the invention will be seen in the disclosure which follows.

BRIEF DESCRIPTION OF THE FIGURE

[0010] FIG. 1 compares deduced amino acid sequences of murine and human TIF (SEQ ID NOS: 27 and 28, respectively).

DETAILED DESCRIPTION OF PREFERRED EMBODIMENTS

Example 1

[0011] The murine lymphoma cell line BW5147 is well known as a cell line which can be grown in vitro, without the need to add any cytokines to its culture medium. In order to identify genes induced by IL-9, samples of BW5147 were cultured either with (200 U/ml), or without IL-9, for 24 hours. Then, total RNA was isolated, using guanidinium isothiocyanate lysis, and CsCl gradient centrifugation. These techniques are well known in the art. Following this, polyadenylated RNA was purified from the total RNA, by using an oligo(dT) cellulose column. The isolated, polyA RNA was then used to generate double stranded cDNA. A commercially available oligo(dT) primer was used. Anywhere from 3-5 ug of polyA RNA were heated to 70° C. for 10 minutes with 1 μg of oligo dT, and then incubated with 5× first strand buffer (250 mM Tris-HCl (pH 8.3), 375 mM KCl, 15 mM MgCl₂), 10 mM dithiothreitol, 500 uM of deoxynucleotide triphosphates, and 800 U of reverse transcriptase. Total volume of the reaction mixture was 20 ul, and the reaction was allowed to proceed at 37° C. for one hour. This resulted in synthesis of the first strand of cDNA. Second strand synthesis was accomplished by adding 30 ul of 5× second strand buffer (100 mM Tris-HCl (pH 6.9), 450 mM KCl, 23 mM MgCl₂, 10.75 mM β-NAD+, 50 mM (NH₄)₂SO₄), together with 60 U of E. coli derived DNA polymerase I, 2 U of E. coli RNase H, 10 U of E. coli DNA ligase, and 250 uM of deoxynucleotide triphosphates, and brought to a final volume of 150 ul. The mixture was incubated for two hours, at 16° C.
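The dilution of the second strand buffer can be checked with a short calculation. The following Python sketch is illustrative only and is not part of the disclosed method. It assumes that the listed concentrations refer to the five-fold concentrated stock, that 30 ul of stock is brought to 150 ul, and that carry-over from the 20 ul first strand reaction is ignored. The function and variable names are hypothetical.

```python
# Stock concentrations (mM) of the concentrated second strand buffer,
# taken from the paragraph above. Only a subset of components is listed.
STOCK_MM = {
    "Tris-HCl": 100.0,
    "KCl": 450.0,
    "MgCl2": 23.0,
    "(NH4)2SO4": 50.0,
}


def final_concentration(stock, volume_added_ul, final_volume_ul):
    """Apply C1 * V1 = C2 * V2 and solve for C2."""
    if final_volume_ul <= 0 or volume_added_ul < 0:
        raise ValueError("volumes must be positive")
    return stock * volume_added_ul / final_volume_ul


def dilute_buffer(stock_mm, volume_added_ul, final_volume_ul):
    """Return the final concentration (mM) of every buffer component."""
    return {
        name: final_concentration(conc, volume_added_ul, final_volume_ul)
        for name, conc in stock_mm.items()
    }


if __name__ == "__main__":
    # 30 ul of stock in a final volume of 150 ul is a five-fold dilution.
    for name, conc in dilute_buffer(STOCK_MM, 30.0, 150.0).items():
        print(f"{name}: {conc:.1f} mM")
```

Under these assumptions the working concentrations would be about 20 mM Tris-HCl, 90 mM KCl, 4.6 mM MgCl₂ and 10 mM (NH₄)₂SO₄.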

[0012] The product was extracted using phenol-chloroform, and was precipitated with ethanol. The final cDNA product was then resuspended in 200 μl of TE.

[0013] These steps were carried out for both the stimulated BW5147 cells (“tester” hereafter), and for parallel, unstimulated BW5147 cells (“driver” hereafter).

Example 2

[0014] The cDNA prepared in Example 1 was then subjected to subtraction cloning in accordance with well known methods. To do this, six oligonucleotides were prepared:

5′-AGCACTCTCC AGCCTCTCAC CGCA-3′ (SEQ ID NO: 1)

5′-GATCTGCGGT GA-3′ (SEQ ID NO: 2)

5′-ACCGACGTCG ACTATCCATG AACA-3′ (SEQ ID NO: 3)

5′-GATCTGTTCA TG-3′ (SEQ ID NO: 4)

5′-AGGCAACTGT GCTATCCGAG GGAA-3′ (SEQ ID NO: 5)

and 5′-GATCTTCCCT CG-3′ (SEQ ID NO: 6)
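The six oligonucleotides form three adapter pairs, each consisting of a 24-mer and a 12-mer. The following Python sketch, offered only as an illustration, checks two properties of each pair: that the 12-mer begins with GATC (the overhang left by DpnII digestion) and that its remaining eight bases are complementary to the 3′ end of the 24-mer. The function names are hypothetical and do not come from the disclosure.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


# Sequences as listed above, written 5' to 3' without spaces.
ADAPTER_PAIRS = {
    "SEQ ID NOS: 1 and 2": ("AGCACTCTCCAGCCTCTCACCGCA", "GATCTGCGGTGA"),
    "SEQ ID NOS: 3 and 4": ("ACCGACGTCGACTATCCATGAACA", "GATCTGTTCATG"),
    "SEQ ID NOS: 5 and 6": ("AGGCAACTGTGCTATCCGAGGGAA", "GATCTTCCCTCG"),
}


def check_adapter(long_oligo, short_oligo, overhang="GATC"):
    """Return (has_overhang, anneals_to_3prime_end, paired_length)."""
    has_overhang = short_oligo.startswith(overhang)
    annealing_part = short_oligo[len(overhang):]
    # The annealing part pairs with the 24-mer if its reverse complement
    # equals the last bases of the 24-mer.
    anneals = long_oligo.endswith(reverse_complement(annealing_part))
    return has_overhang, anneals, len(annealing_part)


if __name__ == "__main__":
    for label, (long_oligo, short_oligo) in ADAPTER_PAIRS.items():
        print(label, check_adapter(long_oligo, short_oligo))
```

For all three pairs the check reports a GATC overhang and an eight base paired region, consistent with adapters designed to be ligated to DpnII-cut cDNA.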

[0015] These were used as explained herein. Double stranded cDNA (2 ug), was digested with restriction endonuclease DpnII, extracted with phenol-chloroform, precipitated with ethanol, and resuspended in 20 ul of TE (10 mM Tris-HCl (pH 7.5); 1 mM EDTA). Twelve ul (1.2 ug), of cut cDNA was ligated to double stranded SEQ ID NOS: 1 and 2, in a mixture which included 4 ul of desalted SEQ ID NO: 1 (2 mg/ml), 4 ul desalted SEQ ID NO: 2 (1 mg/ml), 10 μl of 5× adapter buffer (330 mM Tris-HCl, pH 7.6, 50 mM MgCl₂, 5 mM ATP), 7 μl DTT (100 mM), and 28 μl of H₂O. The oligonucleotides were annealed to each other and to the sample DNA by heating the mixture to 50° C. and then cooling it to 10° C. over one hour, followed by adding 5 ul of T4 DNA ligase, and incubation for 12-14 hours, at 12-16° C. The mixtures were diluted by adding 140 ul of TE. PCR was then carried out on 200 ul samples, as described infra.

Example 3

[0016] To carry out PCR, 200 ul samples containing 2 ul of the ligation product in a buffer of 66 mM Tris-HCl, pH 8.8, 4 mM MgCl₂, 16 mM (NH₄)₂SO₄, 33 ug/ml BSA, 0.3 mM of each dNTP (concentration: 500 μM), and 2 ug of SEQ ID NO: 2 were first heated at 72° C. for three minutes to remove any of SEQ ID NO: 1 which was hybridized to the product of Example 2. The 3′ ends were then filled in by using 5 U of Taq polymerase (5 minutes, 72° C.). Twenty cycles of amplification were carried out (1 cycle:1 minute at 95° C., and three minutes at 72° C.), after which products were combined, phenol extracted, ethanol precipitated, and resuspended in TE buffer, at a concentration of 0.5 ug/ul. Hereinafter, this is referred to as the representation.

Example 4

[0017] The representation was then prepared for subtractive hybridization by removing SEQ ID NO: 1 therefrom by digestion with Dpn II. The resulting digest was phenol extracted and ethanol precipitated. In the case of the unstimulated sample, this resulted in the driver, while the stimulated sample resulted in the tester. Portions of tester (20 ug) were gel purified on a 1.2% agarose gel and isolated. Samples (2 ug), were ligated to SEQ ID NOS: 3 and 4, in the same way that SEQ ID NOS: 1 and 2 were ligated, as described, supra.

[0018] In a first cycle of subtractive hybridization, 0.4 ug samples of tester with SEQ ID NOS: 3 and 4 ligated thereto were mixed with 40 ug of driver cDNA. The mixture was phenol extracted, ethanol precipitated, and dissolved in 2 ul of 3×EE buffer (30 mM EPPS, pH 8.0; 3 mM EDTA). This was overlaid with 30 ul of mineral oil, and denatured for five minutes at 98° C. A 5M NaCl solution (0.5 ul) was added, and DNA was hybridized for 20 hours, at 67° C. The reaction mixture was diluted to 200 ul with TE, and tRNA carrier. The samples were incubated for three minutes at 72° C. to melt away SEQ ID NO: 4, and then four PCR reactions (200 ul) were prepared. These included 20 ul of diluted hybridization mix without primer, to fill in the ends of the reannealed tester, followed by 10 cycles of amplification after adding samples of SEQ ID NO: 3 (1 cycle: 1 minute at 95° C., three minutes at 70° C.) after which products were combined, phenol extracted, ethanol precipitated, and resuspended in 40 μl of 0.2×TE buffer. Single stranded DNA was degraded by a 30 minute treatment of 20 μl of this material with 20 U of mung bean nuclease, at a total volume of 40 ul. Samples were diluted (1:5), in 50 mM Tris-HCl, at pH 8.9, followed by five minutes of heating at 98° C. to inactivate the enzyme. A second PCR was carried out, using 20 ul of the product described supra, 2 ul of SEQ ID NO: 3 (1 mg/ml), and 1 ul (5 U) of Taq DNA polymerase. A total of 18 cycles (1 cycle: 1 minute at 95° C., three minutes at 70° C.) were carried out. Products were combined, phenol extracted, ethanol precipitated, and resuspended at 0.5-1 ug/μl. The product is referred to hereafter as “DP1”, or the first difference product.

Example 5

[0019] DP1 was then digested with endonuclease DpnII, as described above, and was ligated to SEQ ID NOS: 5 and 6, following the same processes described for SEQ ID NOS: 1, 2, 3 and 4. Subtractive hybridization and selective amplification, as described in example 4, were repeated, and a second difference product, or “DP2”, was generated. In these experiments, 50 ng of DP1 was the tester. The driver (40 ug), was as described supra. The process was repeated to generate a third difference product, using SEQ ID NOS: 3 and 4 as adapters. To generate the third product, 100 pg of tester were mixed with 40 μg of driver. All steps of the protocols supra were repeated, except the final amplification was carried out for 22 cycles, where one cycle was one minute at 95° C., and three minutes at 70° C. This yielded the final difference product.
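The stringency of each subtraction round can be expressed as the mass excess of driver over tester. The following Python sketch is illustrative only. It uses the tester and driver amounts stated in Examples 4 and 5, converted to picograms, and the round labels and function name are hypothetical.

```python
# Driver mass used in every round: 40 ug, expressed in picograms.
DRIVER_PG = 40_000_000

# Tester mass per round, in picograms (0.4 ug, 50 ng and 100 pg).
TESTER_PG = {
    "first round": 400_000,
    "second round": 50_000,
    "third round": 100,
}


def driver_excess(tester_pg, driver_pg=DRIVER_PG):
    """Return how many times more driver than tester is present, by mass."""
    if tester_pg <= 0:
        raise ValueError("tester mass must be positive")
    return driver_pg / tester_pg


if __name__ == "__main__":
    for label, tester_pg in TESTER_PG.items():
        print(f"{label}: driver excess {driver_excess(tester_pg):,.0f}-fold")
```

On these figures the driver excess rises from 100-fold in the first round to 800-fold in the second and 400,000-fold in the third.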

Example 6

[0020] The final difference products were digested with DpnII, and then cloned into the BamHI site of a commercially available vector, i.e., pTZ19R. Double stranded DNA plasmids were prepared, and then sequenced, using standard methods. The sequences were compared to known sequences in the GenBank and EMBL databases, using a BLAST search program.

[0021] At the end of this subtraction procedure, a short cDNA fragment was identified, i.e., a fragment about 200 base pairs long. This fragment was used to screen a cDNA library from BW5147 cells. The largest clone was sequenced. It is discussed infra. It does not correspond to any known sequence.

[0022] The nucleotide sequence (SEQ ID NO: 7), is 1121 bases long, including a 537 base pair open reading frame, which encodes a protein 179 amino acids long. The predicted molecular weight of the protein is 20,093. There are two additional ATG codons which, if they acted as start codons, would produce proteins 172 and 167 amino acids in length, with molecular weights of 19,335 and 18,770 daltons, respectively. Each form of the protein is characterized by a sequence of hydrophobic amino acids which would be cleaved off of the molecule via the endoplasmic reticulum to provide a mature protein.
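The figures in this paragraph are mutually consistent, as a brief calculation shows. The following Python sketch is illustrative only. It assumes that the 537 base pair figure counts coding codons without the stop codon and that the two additional ATG codons lie in the same reading frame as the first. The function names are hypothetical.

```python
def orf_length_bp(n_residues, include_stop=False):
    """Length in base pairs of an open reading frame for n_residues codons."""
    return 3 * (n_residues + (1 if include_stop else 0))


def start_offset_nt(full_length_aa, shorter_length_aa):
    """Nucleotide distance between the first ATG and a downstream in-frame ATG."""
    return 3 * (full_length_aa - shorter_length_aa)


def mean_residue_mass(molecular_weight_da, n_residues):
    """Average mass per amino acid residue, in daltons."""
    return molecular_weight_da / n_residues


if __name__ == "__main__":
    print(orf_length_bp(179))                    # coding length for 179 residues
    print(start_offset_nt(179, 172))             # offset of the second ATG
    print(start_offset_nt(179, 167))             # offset of the third ATG
    print(round(mean_residue_mass(20093, 179), 2))
```

Under these assumptions, 179 codons give 537 base pairs, the alternative start codons would lie 21 and 36 nucleotides downstream of the first ATG, and the average residue mass of about 112 daltons is in the range typical of proteins.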

[0023] Analysis of the sequence shows three AT rich motifs (TTATTTAT). These motifs are often found in 3′-untranslated regions of cytokines and oncogenes. Kruys, et al., Science 245: 852 (1989), have shown that these repeats modulate stability of mRNA for TNF.
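A motif search of this kind can be sketched in a few lines of Python. The example below is illustrative only. The test sequence is a made-up string, not SEQ ID NO: 7, and the function name is hypothetical. A lookahead pattern is used so that overlapping copies of the motif are also reported.

```python
import re


def find_motif(sequence, motif="TTATTTAT"):
    """Return 1-based start positions of every occurrence of motif.

    Overlapping occurrences are included. RNA input is accepted by
    converting U to T before searching.
    """
    sequence = sequence.upper().replace("U", "T")
    pattern = re.compile(f"(?={re.escape(motif)})")
    return [match.start() + 1 for match in pattern.finditer(sequence)]


if __name__ == "__main__":
    # Hypothetical sequence used only to exercise the function.
    toy_sequence = "GGTTATTTATCCTTATTTATTTATGG"
    print(find_motif(toy_sequence))
```

In the toy sequence the motif occurs three times, two of which overlap, which a plain non-overlapping search would undercount.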

Example 7

[0024] The cDNA isolated and analyzed in example 6, supra, was then used as a probe to identify genomic DNA for TIFα.

[0025] A genomic library prepared from mouse strain 129 was screened with SEQ ID NO: 7, following standard methods. An EcoRI fragment from a positive clone was subcloned into plasmid pZERO and partially sequenced. The partial sequence is presented as SEQ ID NO: 8.

Example 8

[0026] A second EcoRI fragment from the positive clone described in Example 7, supra, was also subcloned. There was a great deal of homology, but the sequences were not identical. To be specific, intron 1 of this sequence was 98% identical to SEQ ID NO: 8, intron 2 was 100% identical and intron 3 was 92% identical.

[0027] What is striking about the sequences is that the promoters are not at all homologous, suggesting independent regulation. The 5′ untranslated regions are 92% identical. The first exon for TIFα is split into exon 1a and exon 1b. The first coding exons (which are exon 1b for TIFα and exon 1 for TIFβ) are 99.5% identical, while the second exons are 100% identical, the third exons 97% identical, the fourth exons 98.5% identical, and the fifth exons 96% identical. In the 3′ untranslated region, homology is 96%.

Example 9

[0028] Using the information described in example 8, supra, a cDNA sequence for the second clone, designated TIFβ was deduced, and is set forth as SEQ ID NO: 9. The genomic DNA sequence was also ascertained, in the same manner as is described, supra, and is set forth as SEQ ID NO: 29.

[0029] As compared to the coding region for TIFα, that of TIFβ has six silent changes. There are two changes which result in an inconsequential amino acid change (at both of positions 36 and 113, Val in TIFα becomes Ile in TIFβ). There is also a more significant change, at position 112, where Gln becomes Arg.

Example 10

[0030] Experiments were undertaken to study expression of the TIFs. BW5147 cells were stimulated with recombinant murine IL-9 (200 U/ml), for varying periods of time (0.2, 0.5, 1, 2 and 24 hours). Total RNA was then isolated, using standard methods and reagents. Reverse transcription was then carried out, using 5 μg total RNA and an oligo (dT) primer. Samples of cDNA corresponding to 20 ng of total RNA were then amplified for 25 cycles using different primers. (One cycle was 4 minutes at 94° C., 1 minute at 57° C., and 2 minutes at 72° C.). The TIF primers were:

5′-CTGCCTGCTT CTCATTGCCC T-3′ (SEQ ID NO: 10)

[0031] and

5′-CAAGTCTACC TCTGGTCTCA T-3′ (SEQ ID NO: 11)

[0032] (sense and antisense, respectively).

[0033] These correspond to nucleotides 106-126, and 764-784 of SEQ ID NO: 7, respectively. As a control, β-actin was amplified as well, for 18 cycles (first cycle: 4 minutes at 94° C., 1 minute at 60° C., 2 minutes at 72° C. Succeeding cycles were 1 minute at 94° C., 1 minute at 60° C., 2 minutes at 72° C.).
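The primer coordinates determine the expected size of the amplification product. The following Python sketch is illustrative only. It assumes the stated positions are 1-based and inclusive on SEQ ID NO: 7, and the function names are hypothetical.

```python
SENSE_PRIMER = "CTGCCTGCTTCTCATTGCCCT"      # SEQ ID NO: 10
ANTISENSE_PRIMER = "CAAGTCTACCTCTGGTCTCAT"  # SEQ ID NO: 11

SENSE_SPAN = (106, 126)      # nucleotides of SEQ ID NO: 7
ANTISENSE_SPAN = (764, 784)  # nucleotides of SEQ ID NO: 7


def span_length(span):
    """Length of a 1-based inclusive coordinate span."""
    start, end = span
    return end - start + 1


def amplicon_length(sense_span, antisense_span):
    """Expected product size, from the first sense base to the last antisense base."""
    return antisense_span[1] - sense_span[0] + 1


if __name__ == "__main__":
    # Each primer should be as long as the span it is said to cover.
    print(len(SENSE_PRIMER) == span_length(SENSE_SPAN))
    print(len(ANTISENSE_PRIMER) == span_length(ANTISENSE_SPAN))
    print(amplicon_length(SENSE_SPAN, ANTISENSE_SPAN))
```

Both primers are 21 nucleotides long, matching their coordinate spans, and the expected product is 679 base pairs.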

[0034] Following amplification, post PCR products were analyzed on a 1% agarose gel, and specific amplification was confirmed, following blotting, using internal radioactive probes. The probe for TIF was:

5′-GACGCAAGCA TTTCTCAGAG-3′ (SEQ ID NO: 12)

[0035] The conditions and probes set forth were not specific for one or the other of the forms of TIF; however, the amplification product of TIFα contains a KpnI restriction site, while that of TIFβ does not. Digestion of the amplification products with KpnI indicated that most, if not all, of the TIF mRNA induced by IL-9 was TIFα, suggesting that TIFα expression was induced rapidly by IL-9. The mRNA for TIFα was detectable after 30 minutes of stimulation, and reached a plateau over a 1-24 hour time period.
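The logic of discriminating two amplification products by a restriction digest can be sketched as follows. This Python example is illustrative only. KpnI recognizes GGTACC and cuts the top strand after the fifth base of the site. The two test sequences are made-up strings and are not the TIFα or TIFβ amplification products. The function name is hypothetical.

```python
KPNI_SITE = "GGTACC"
KPNI_TOP_STRAND_CUT = 5  # cut falls after GGTAC on the top strand


def digest_fragment_lengths(amplicon, site=KPNI_SITE, cut_offset=KPNI_TOP_STRAND_CUT):
    """Return top-strand fragment lengths after complete digestion.

    A product without the site is returned as a single fragment.
    """
    amplicon = amplicon.upper()
    cut_positions = []
    position = amplicon.find(site)
    while position != -1:
        cut_positions.append(position + cut_offset)
        position = amplicon.find(site, position + 1)
    bounds = [0] + cut_positions + [len(amplicon)]
    return [end - start for start, end in zip(bounds, bounds[1:])]


if __name__ == "__main__":
    # Hypothetical products: one with a KpnI site, one with a single base change.
    with_site = "AAAAGGTACCTTTTTT"
    without_site = "AAAAGGTGCCTTTTTT"
    print(digest_fragment_lengths(with_site))
    print(digest_fragment_lengths(without_site))
```

A product carrying the site yields two fragments, while a product lacking it remains intact, so the band pattern on a gel indicates which form predominates.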

Example 11

[0036] Experiments were then carried out which showed that the induction of TIF mRNA by IL-9, described supra, does not require protein synthesis. In these experiments, total RNA was extracted from cells stimulated for 24 hours, as described in example 10, but with or without 10 μg/ml of a protein synthesis inhibitor, cycloheximide, for 4.5 hours. In a parallel set of experiments, cells were not stimulated. The total RNA was extracted, and RT-PCR amplification was carried out as described in example 10. Post-PCR products were analyzed on an ethidium bromide-stained, 1% agarose gel. What was seen was that the induction by IL-9 still occurred when protein synthesis was blocked. Hence, the effect of IL-9 is a direct effect, not requiring the synthesis of a protein mediator.

Example 12

[0037] In these experiments, the role of STAT proteins in induction of TIF mRNA was studied on derivatives of the cell line BW5147. The first line, BWh9R, expresses wild type human IL-9 receptors. The line BW-Phe116 is a transfectant with a single mutation (at position 116), which renders the receptor unable to activate STAT transcription factors. Still another cell line, BW-mut6, has a mutation which renders the receptor unable to activate STAT5, while retaining the ability to activate STAT1 and STAT3. Finally, cell line BW-mut7 has a single mutation which renders the IL-9 receptor unable to activate STAT1 and STAT3, but which retains the ability to activate STAT5.

[0038] Cell stimulation, isolation of total RNA, reverse transcription and amplification of cDNA were all carried out as described in example 10 (Cells were stimulated for 24 hours. Both human and murine recombinant IL-9 were used). The PCR products were analyzed on an ethidium bromide stained, 1% agarose gel, as described supra.

[0039] The analysis revealed that human IL-9 did not induce expression in BW-Phe116, suggesting that STAT transcription factors are implicated. It was found that IL-9 induced TIF expression in the BW-mut6 mutant, but not the mut7 variant, suggesting that STAT1 or STAT3 are involved, but not STAT5.
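The inference drawn from the receptor mutants can be made explicit by enumeration. The following Python sketch is illustrative only. It encodes the three mutant lines with the STAT factors each can still activate and the induction result reported above, and then lists every hypothesis of the form "induction requires at least one STAT from a given set" that fits all three observations. This simple model and all names in the code are assumptions made for illustration.

```python
from itertools import combinations

STATS = ("STAT1", "STAT3", "STAT5")

# Cell line -> (STATs the mutant receptor can still activate, TIF induced?)
OBSERVATIONS = {
    "BW-Phe116": (frozenset(), False),
    "BW-mut6": (frozenset({"STAT1", "STAT3"}), True),
    "BW-mut7": (frozenset({"STAT5"}), False),
}


def is_consistent(required_any, observations=OBSERVATIONS):
    """True if 'induction needs at least one STAT in required_any' fits the data."""
    return all(
        bool(active & set(required_any)) == induced
        for active, induced in observations.values()
    )


def consistent_hypotheses():
    """Enumerate every non-empty STAT subset that explains all observations."""
    hypotheses = []
    for size in range(1, len(STATS) + 1):
        for subset in combinations(STATS, size):
            if is_consistent(subset):
                hypotheses.append(subset)
    return hypotheses


if __name__ == "__main__":
    for hypothesis in consistent_hypotheses():
        print(" or ".join(hypothesis))
```

Only subsets drawn from STAT1 and STAT3 survive. Every subset containing STAT5 is excluded by the BW-mut7 result, which matches the conclusion stated above.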

Example 13

[0040] The expression of TIF mRNA in normal mouse spleen cells was then studied.

[0041] Spleen cells from 10-12 week old Balb/c mice were cultured for 24 hours in control medium or the control medium supplemented with 20 μg/ml of LPS (which activates B lymphocytes and macrophages), or ConA (which activates T cells), or ConA plus 1% of a blocking antiserum against murine IL-9, with β actin being used as a control. Purification of RNA and RT-PCR analysis were carried out as described supra.

[0042] The data indicated that TIF is, at best, very weakly expressed in resting spleen cells, not induced by LPS, but strongly induced by ConA. Anti IL-9 antiserum did not affect induction by ConA, suggesting that its effect is not mediated by IL-9, or is mediated by other cytokines.

[0043] When the ConA activated spleen cells were analyzed using sequences of RT-PCR products, it was found that these cells were expressing TIFα predominantly, or exclusively.

Example 14

[0044] Further experiments showed that TIF mRNA was expressed even in the absence of IL-9 induction.

[0045] Spleen cells from 5 week old FVB mice were enriched for T cells, using a nylon wool column. Then, the cells were stimulated for 24 hours in medium supplemented with ConA (a T cell activator), or PMA (which activates PKC in most cells), either with or without IL-9.

[0046] Total RNA was isolated using standard techniques, and then ten microgram samples were fractionated via electrophoresis on a 1.3% agarose gel containing 2.2M formaldehyde. The fractions were then transferred to a nitrocellulose membrane, labeled, and assayed in a hybridization assay following Van Snick, et al, J. Exp. Med. 169: 363 (1989), incorporated by reference.

[0047] The results indicated that the induction of TIF by ConA was not modified, and that IL-9 did not induce TIF RNA in PMA activated spleen cells.

Example 15

[0048] The expression of TIF mRNA in various cell lines was tested. In these experiments, murine cell lines were stimulated for at least one day, with a particular cytokine. Specifically, 9T7 is a T cell lymphoma, which responds to IL-2, IL-4 or IL-9. Cell lines TS3 and TS6 are derived from T helper cell clones, and proliferate in the presence of either IL-2 or IL-9. MC9 and LI38 are mast cell lines, which proliferate in the presence of either IL-3 or IL-9.

[0049] Following stimulation, total RNA was prepared using standard guanidinium isothiocyanate lysis, and CsCl gradient centrifugation.

[0050] The 9T7 line was then analyzed by Northern blotting, as described in example 14, while the other lines were assayed using RT-PCR analysis, as described supra.

[0051] It was found that IL-9 upregulated TIF expression in T helper cells and mast cells, while IL-2 and IL-3 did not. The 9T7 cell line, however, showed roughly the same level of expression, regardless of the cytokine, indicating that IL-9 is not mandatory for TIF expression.

Example 16

[0052] The expression of TIF mRNA in B cell lines was then studied. The cell lines A20, 70Z/3, and BCL-1 are B cell leukemia cell lines which grow, in vitro, without cytokines. These cells were stimulated for 24 hours with IL-4 and IL-9 and total RNA was isolated, using standard methods. Expression was analyzed by RT-PCR which was carried out for 35 cycles, followed by blotting and hybridization, as described supra.

[0053] The results indicated that TIF expression is detectable in B cells, but is weakly upregulated at best in the presence of IL-9 and IL-4.

Example 17

[0054] Experiments were then carried out to study expression of the inventive molecules in T helper cell lines. TS2 and TS1 are known T helper cell lines, derived from T helper cell clones, which proliferate in the presence of either IL-9 or IL-2 (TS2), and either IL-9 or IL-4 (TS1). Specifically, TS1 or TS2 cells were grown in the presence of the listed cytokines for at least 10 days, after which RNA was extracted using known methods. Expression of the molecules was studied via RT-PCR (35 cycles), using the protocols described supra. In TS1 cells both IL-4 and IL-9 induce TIF expression, but IL-2 does not do so in TS2 cells.

Example 18

[0055] Expression of TIF mRNA in various mouse organs was studied. Total RNA was prepared from liver, kidney, heart, brain, intestine, spleen, thymus, lung, muscle and bone marrow, using standard guanidinium isothiocyanate methodologies and CsCl gradient centrifugation. Forty cycles of RT-PCR were carried out, using the protocols described supra. Strongest expression was found in thymus tissue, while less intense signals were found in brain tissue, and weaker expression in the remaining tissues.

Example 19

[0056] The following experiments describe production of TIFα, in 293-EBNA cells.

[0057] Complementary DNA for TIFα was described supra. It was subcloned into a commercially available expression vector pCEP-4, in operable linkage with a CMV promoter. The resulting plasmids were transfected into 293-EBNA cells, using standard lipofectamine methods. Following transfection, the cells were incubated in a methionine free medium, supplemented with ³⁵S labeled methionine, for 24 hours. Supernatant was harvested, and subjected to electrophoresis on an acrylamide gel. The gel was then dried and exposed to autoradiography for 1 day. A control was then run by transfecting cells with the same plasmid, in which the cDNA was cloned in the antisense direction.

[0058] A heterogeneous band of about 25-30 kilodaltons was found from the cells transfected with TIF in the sense direction. Any discrepancies between the predicted molecular weight, the actual molecular weight in the system, and the heterogeneity, can be attributed to glycosylation. In a series of parallel experiments, cDNA encoding human TIF was expressed in the same way as the murine cDNA was expressed. With the exception of the change of the cDNA, all experimental parameters were the same.

Example 20

[0059] Further experiments were carried out to study production of TIFα in COS cells. Specifically, TIFα cDNA was subcloned into the plasmid pEF-BOS.puro described by Demoulin et al., supra, in operable linkage with the EF-1α promoter. The plasmid cDNA was transfected into COS cells, using the same lipofectamine method described supra. The cells were incubated in methionine free medium, supplemented with ³⁵S methionine for 24 hours, after which supernatant was treated as described in example 19, supra. Again, a heterogeneous band of 25-30 kilodaltons was observed, as well as an 18 kilodalton band, which probably represents a non-glycosylated form of the molecule.

Example 21

[0060] In these experiments, it was discovered that TIF induces STAT activation in mesangial, neuronal, melanoma, and hepatoma cells. It is known that when cytokines activate STAT factors, the factors dimerize, move from cytoplasm to the nucleus, and bind to target sequences in promoters. The details of the experiments follow.

[0061] Supernatants from transfected 293-EBNA cells, as described supra, were used following incubation in normal medium for 48 hours, as were supernatants from the controls, also described supra. Samples of a mouse kidney mesangial cell line, (“MES13” hereafter), a rat pheochromocytoma cell line, (“PC12” hereafter), four different human melanomas (SK23, AUMA, NA-8mel and MULL), human hepatoma (HepG3) and rat hepatoma (H-4-II-K) were used. Cell samples (0.5×10⁶) were stimulated for 5-10 minutes in the presence of 1% of supernatant. Nuclear extracts were then prepared, in accordance with Demoulin et al., Mol. Cell. Biol. 16: 4710 (1996), incorporated by reference. In brief, cells were washed with PBS and then resuspended in 1 ml of ice cold hypotonic buffer for 15 minutes. (Buffer was 10 mM HEPES buffer, pH 7.5, with 10 mM KCl, 1 mM MgCl₂, 5% glycerol, 0.5 mM EDTA, 0.1 mM EGTA, 0.5 mM dithiothreitol, and 1 mM Pefabloc, 1 mM Na₃VO₄, and 5 mM NaF). Cells were then lysed by adding 65 μl of NP-40, followed by vortexing. Nuclei were pelleted, by centrifuging for 30 seconds at 14,000 rpm, followed by extraction in buffer supplemented with HEPES (20 mM), glycerol (20%), and NaCl (420 mM). Nuclear debris was removed by centrifuging for 2 minutes. DNA binding activity was determined in accordance with Demoulin et al., supra, using a ³²P labeled double stranded oligonucleotide called “GRR,” which contains the STAT binding site of the FcγRI gene promoter, i.e.:

5′-ATGTATTTCC CAGAAA-3′ (SEQ ID NO: 13)

[0062] and

5′-CCTTTTCTGG GAAATAC-3′ (SEQ ID NO: 14)

[0063] corresponding to the upper and lower strands of the binding sites in the GRR probe. Briefly, 5 μl volume of nuclear extracts were incubated in binding buffer (12 mM HEPES, pH 7.6, 10 mM KCl, 0.5 mM EDTA, 2.5% glycerol, 0.1 mg of poly(dI-dC) per ml) for 5 minutes. Radiolabeled GRR probe (10⁵ cpm; approximately 0.5 ng) was added, and incubation was continued for 25 minutes before loading onto a non-denaturing polyacrylamide gel.

[0064] It was also noted that the complexes observed in MES13 cells, described supra, were partially overshifted by both anti-STAT5 and anti-STAT3 antibodies, showing that (i) the cells under examination were targets for TIF, and (ii) that STAT3 and STAT5 are major components of the complex activated by TIF. The difference in STAT profile, as compared to the profile in Example 12, supra, is attributable to the difference in cell source (human versus mouse). It was also observed that human TIF works on murine cells, and vice versa.

Example 22

[0065] This example details the isolation and cloning of a nucleic acid molecule which encodes human TIF. First, human peripheral blood mononuclear cells were prepared via standard density gradient centrifugation. Following this preparation, samples were cultured for 24 hours, at 3×10⁶ cells/ml, either with or without anti-CD3 monoclonal antibody (the antibody was the commercially available OKT3 mAb, used in the form of ascites fluid at 1/500 dilution). This antibody was used because T cell derived cytokines are generally expressed only upon activation by, e.g., CD3 specific antibodies.

[0066] Total RNA was isolated from these cells, using standard guanidine-isothiocyanate/CsCl ultra-centrifugation techniques. Following isolation, 10 μg samples of the RNA were reverse transcribed using an oligo (dT)15 primer.

[0067] Following preparation of cDNA, as outlined supra, samples which corresponded to 100 ng of total RNA were amplified, via PCR, using the following primers:

5′-AGCTGCTCAA CTTCACCCTG GA-3′ (SEQ ID NO: 15)

5′-CCACTCTCTC CAAGCTTTTT CA-3′ (SEQ ID NO: 16)

[0068] which are based upon a murine cDNA sequence (i.e., SEQ ID NO: 7). The PCR conditions involved 30 cycles of amplification, with one cycle defined as 1 minute at 94° C., followed by 1 minute at 42° C., and then 2 minutes at 72° C. Amplification product was separated on an agarose gel, using standard methods, and then sequenced. The result indicated that fragments of the cDNA had been amplified. Hence, a second reaction was carried out, using the same materials except that SEQ ID NO: 16 was replaced by SEQ ID NO: 17, i.e.:

5′-CAAGTCTACC TCTGGTCTCA T-3′

[0069] This second PCR reaction was carried out for 25 cycles, with one cycle being defined as 1 minute at 94° C., followed by 1 minute at 45° C., and then 2 minutes at 72° C. The amplification product was subjected to the same steps as the first one. Again, fragments of cDNA were amplified.
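
The two cycling programs described in this example can be written down as data, which makes the differences between them (cycle number and annealing temperature) explicit. The following Python sketch is only an illustration: the class names are hypothetical, and the computed time counts the programmed holds only, ignoring ramping and any initial denaturation or final extension, none of which are specified above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    temp_c: int      # hold temperature in degrees Celsius
    minutes: float   # hold time in minutes


@dataclass(frozen=True)
class PcrProgram:
    name: str
    cycles: int
    steps: tuple

    def hold_minutes(self) -> float:
        """Total programmed hold time, ignoring ramping between temperatures."""
        return self.cycles * sum(step.minutes for step in self.steps)


# First reaction: primers SEQ ID NOS: 15 and 16, annealing at 42 C.
FIRST = PcrProgram("SEQ ID NOS: 15 + 16", 30,
                   (Step(94, 1), Step(42, 1), Step(72, 2)))
# Second reaction: primers SEQ ID NOS: 15 and 17, annealing at 45 C.
SECOND = PcrProgram("SEQ ID NOS: 15 + 17", 25,
                    (Step(94, 1), Step(45, 1), Step(72, 2)))

for prog in (FIRST, SECOND):
    print(f"{prog.name}: {prog.cycles} cycles, "
          f"{prog.hold_minutes():.0f} min of hold time")
```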

Example 23

[0070] Following preparation of amplification product, the 5′ end of cDNA was isolated by using standard, 5′-RACE techniques. In brief, first strand cDNA was prepared by using SEQ ID NO: 18 as a primer, i.e.:

5′-TGGCCAGGAA GGGCACCACC T-3′

[0071] This primer was based upon the sequence information obtained in accordance with Example 22. In brief, the 5′-RACE method was carried out by combining 1 μg of total RNA, prepared as described supra, 2.5 pmoles of SEQ ID NO: 18, reverse transcriptase, reverse transcriptase buffer, 2.5 μl of dNTP mix (10 mM), 2.5 μl of MgCl₂ (25 mM), and 2.5 μl of dithiothreitol (0.1 M). The reaction was carried out and, after completion, original RNA was removed by adding RNase H and RNase T1. Any unincorporated dNTPs, as well as primer and proteins, were removed. The cDNA was tailed using terminal transferase, or “TdT.” This enzyme creates a 3′-binding site for the abridged anchor primer, as described infra. Tailing was carried out by combining the purified, first strand cDNA, TdT, buffer (10 mM Tris-HCl, 25 mM KCl, 1.5 mM MgCl₂), and 200 μM of dCTP.

[0072] Following the tailing reaction, PCR was carried out using

5′-TGGCCAGGAA GGGCACCACC T-3′ (SEQ ID NO: 19),

[0073] and 5′-RACE abridged anchor primer:

5′-GGCCACGCGT CGACTAGTAC GGGIIGGGIIGGGIIG-3′ (SEQ ID NO: 20).

[0074] The amplification involved 35 cycles (1 cycle defined as 1 minute at 94° C., 1 minute at 56° C., and 2 minutes at 72° C.). Following this, nested amplification was performed on 5 μl of a 1/100 dilution of the amplification product, using SEQ ID NO: 19 and the abridged universal amplification primer:

5′-GGCCACGCGT CGACTAGTAC-3′ (SEQ ID NO: 21).

[0075] Amplification involved 30 cycles (1 cycle being defined as 1 minute at 94° C., 1 minute at 56° C., and 2 minutes at 72° C.). The resulting PCR product was cloned, following standard procedures, and sequenced.
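
The relationship between the two anchor primers used in the 5′-RACE steps can be illustrated with a short Python sketch. It is an illustration only and assumes, as is generally understood for this kind of primer, that both G and inosine (written “I”) pair with the C residues added by the dCTP tailing step. The variable names are hypothetical.

```python
# Primer sequences as written above; "I" denotes inosine.
AAP = "GGCCACGCGTCGACTAGTACGGGIIGGGIIGGGIIG"   # SEQ ID NO: 20, abridged anchor primer
AUAP = "GGCCACGCGTCGACTAGTAC"                  # SEQ ID NO: 21, nested primer

# Split the anchor primer into its 5' adapter part and its 3' G/I-rich part.
adapter, anchor = AAP[:len(AUAP)], AAP[len(AUAP):]
inosine_positions = [i + 1 for i, base in enumerate(AAP) if base == "I"]

print("nested primer equals 5' part of anchor primer:", adapter == AUAP)
print("3' anchor region:", anchor)
print("anchor region uses only G and I:", set(anchor) <= {"G", "I"})
print("inosine positions (1-based):", inosine_positions)
```

On this reading, the first round primes on the dC tail through the G/I-rich 3′ region, while the nested round needs only the adapter sequence that the first round has already incorporated into the product.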

[0076] These three protocols, i.e., the two experiments described supra which generated fragments, and the 5′-RACE PCR, also described supra, permitted alignment of the sequenced amplification product, to generate the complete sequence.

[0077] Following the alignment, oligonucleotides were generated which flanked the deduced open reading frame, i.e.:

5′-CCTTCCCCAG TCACCAGTTG-3′ (SEQ ID NO: 22)

[0078] and

5′-TAATTGTTAT TCTTAGCAGG-3′ (SEQ ID NO: 23).

[0079] These primers were used to amplify the entire open reading frame, using mRNA from CD3 specific mAb stimulated cells, as described supra. Amplification involved 25 cycles (1 cycle being defined as 1 minute at 94° C., 1 minute at 56° C., and 2 minutes at 72° C.).

[0080] The complete sequence of the human cDNA is set forth at SEQ ID NO: 24.

[0081] As with the murine sequence, there are potential start codons at positions of SEQ ID NO: 24 which correspond to amino acids 1 and 13, as well as codons corresponding to methionine at amino acid positions 58, 85, and 92. The possible initiator codons correspond to proteins with calculated molecular weights of 19,998 daltons and 18,735 daltons (for 179 or 167 amino acids, respectively). As with the murine form of the protein, hydrophobic leader sequences are seen, indicating an N-terminal signal sequence of from about 20 to about 40 amino acids.
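
The open reading frame discussed in this paragraph can be re-derived from the sequence listing. The Python sketch below is an illustration under stated assumptions: it uses the standard genetic code, takes SEQ ID NO: 24 as printed in the sequence listing, and assumes translation begins at the first ATG in the cDNA. The function name is hypothetical.

```python
# SEQ ID NO: 24 as printed in the sequence listing (690 nucleotides).
SEQ_ID_24 = """
tgcacaagca gaatcttcag aacaggttct ccttccccag tcaccagttg ctcgagttag
aattgtctgc aatggccgcc ctgcagaaat ctgtgagctc tttccttatg gggaccctgg
ccaccagctg cctccttctc ttggccctct tggtacaggg aggagcagct gcgcccatca
gctcccactg caggcttgac aagtccaact tccagcagcc ctatatcacc aaccgcacct
tcatgctggc taaggaggct agcttggctg ataacaacac agacgttcgt ctcattgggg
agaaactgtt ccacggagtc agtatgagtg agcgctgcta tctgatgaag caggtgctga
acttcaccct tgaagaagtg ctgttccctc aatctgatag gttccagcct tatatgcagg
aggtggtgcc cttcctggcc aggctcagca acaggctaag cacatgtcat attgaaggtg
atgacctgca tatccagagg aatgtgcaaa agctgaagga cacagtgaaa aagcttggag
agagtggaga gatcaaagca attggagaac tggatttgct gtttatgtct ctgagaaatg
cctgcatttg accagagcaa agctgaaaaa tgaataacta accccctttc cctgctagaa
ataacaatta gatgccccaa agcgattttt
"""

# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate_from_first_atg(cdna: str) -> str:
    """Translate from the first ATG up to (not including) the first stop codon."""
    seq = "".join(cdna.split()).upper()
    start = seq.find("ATG")
    if start < 0:
        raise ValueError("no ATG found in sequence")
    protein = []
    for pos in range(start, len(seq) - 2, 3):
        residue = CODON_TABLE[seq[pos:pos + 3]]
        if residue == "*":
            break
        protein.append(residue)
    return "".join(protein)


protein = translate_from_first_atg(SEQ_ID_24)
met_positions = [i + 1 for i, aa in enumerate(protein) if aa == "M"]
print("residues from first Met:", len(protein))
print("Met positions (1-based):", met_positions)
print("residues from second Met:", len(protein) - met_positions[1] + 1)
```

The printed lengths and methionine positions can be compared directly with the values stated in this paragraph; the sketch does not attempt to predict the signal peptide or compute molecular weights.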

Example 24

[0082] These experiments detail work on the isolation of human genomic DNA corresponding to the cDNA discussed supra.

[0083] Based upon the cDNA sequences, primers were developed which correspond to nucleotides 51-70 and the complement of nucleotides 631-650 of SEQ ID NO: 24. PCR was carried out, using standard methodologies. Specifically, 10 ng of genomic DNA was used as a template, and 33 cycles of amplification were carried out (one cycle of amplification being defined as 94° C. for 30 seconds, 50° C. for 30 seconds, and 72° C. for 5 minutes). Once a sequence was isolated, it was sequenced, and this is set forth as SEQ ID NO: 25. The sequence is about 4.8 kilobases in length, and is believed to contain the entire genomic sequence encoding the TIF molecule, lacking only the 5′ flanking region, the promoter, and the 3′ end.

Example 25

[0084] It was of interest to identify where the genomic DNA discussed supra was located in the human genome. In order to do this, two different approaches were taken. In the first, the sequence discussed supra, i.e., SEQ ID NO: 25, was labeled with a fluorescent label, and then was used to probe the human genome via fluorescent in situ hybridization (“FISH”) using standard methods.

[0085] In a second approach, a panel of radiation hybrid clones was screened using the probe consisting of nucleotides 51-70 of SEQ ID NO: 24, and 5′-ATCAGATGGA TTACTGAATG-3′ (SEQ ID NO: 26). PCR was carried out using 25 ng of genomic DNA as a template, for 35 cycles, where one cycle is defined as 94° C. for 1 minute, 55° C. for 1 minute and 72° C. for 2 minutes.

[0086] Both methodologies indicated that the gene is located at chromosome 12q15. Some work has linked asthma and associated diseases to this site. See, e.g., Nat. Genet. 15:389-392 (1997); Ober, et al., Hum. Mol. Genet. 7(9):1393-1398 (1998); Nickel, et al., Genomics 46(1):159-162 (1997); Takahashi, et al., Genomics 44(1):150-2 (1997); Barnes, et al., Genomics 37(1):41-50 (1996), all incorporated by reference.

Example 26

[0087] These experiments describe the manufacture of antibodies which bind to the TIF protein. To make these, a peptide consisting of amino acids 40-61 encoded by SEQ ID NO: 7 was coupled to KLH carrier protein, using standard methods and a ratio of 1 mg peptide to 1 mg carrier protein. Subject animals (rabbits) were immunized 3 times, at 2 week intervals, with 150 μg of the complex. The immunogen was emulsified in Complete Freund's Adjuvant for the first injection, and then in Incomplete Freund's Adjuvant for the next two.

[0088] A first bleed was performed one month after the last injection, and serum was prepared, following known methods.

[0089] The serum was then tested in a standard Western Blot. In brief, 10 μl samples of supernatant from cells transfected with either SEQ ID NO: 7 or SEQ ID NO: 24 were separated via SDS-PAGE, and then blotted onto PVDF membranes. Antiserum was diluted to 1:500, and used in a standard Western Blot protocol, together with anti-rabbit antibody as the secondary antibody, and a commercially available detection kit.

[0090] It was found that the serum did, in fact, recognize the TIF protein.

[0091] In FIG. 1, the deduced amino acid sequences of murine and human TIF are set out. The high degree of homology is seen in the boxed regions.

[0092] The foregoing examples describe the invention, one aspect of which is isolated nucleic acid molecules which encode TIF proteins such as those with the amino acid sequence of the protein encoded by the nucleotide sequence of SEQ ID NO: 7, 24 or 25. It will be appreciated by one of ordinary skill that the degeneracy of the genetic code facilitates the preparation of nucleic acid molecules which may not be identical to the nucleotide sequence of SEQ ID NO: 7, 24 or 25, but which encode the same protein. Of course, SEQ ID NOS: 7, 24 and 25 are preferred embodiments of this invention, but other embodiments are also a part of the invention. Genomic DNA, complementary DNA, and RNA, such as messenger RNA, are all to be included therein. Isolated nucleic acid molecules from other animal species, including other mammals, are also a part of the invention. A preferred aspect of the invention is isolated nucleic acid molecules whose complements hybridize to SEQ ID NO: 7, SEQ ID NO: 8, SEQ ID NO: 9, or SEQ ID NO: 24 under stringent conditions. “Stringent conditions,” as used herein, refer, for example, to hybridization at 65° C. in buffer (3.5×SSC), 0.02% Ficoll, 0.02% polyvinylpyrrolidone, 0.02% bovine serum albumin, 25 mM NaH₂PO₄ (pH 7), 0.1% SDS, 2 mM EDTA, followed by a final wash at 2×SSC, room temperature and then 0.1×SSC/0.2×SDS at temperatures as high as, e.g., about 65° C. More stringent conditions, such as 0.1×SSC, can also be used. These nucleic acid molecules encode proteins of about 17-22 kD as determined by SDS-PAGE, which activate STAT proteins, such as STAT1, STAT3 and/or STAT5. In glycosylated form, these proteins can range from about 17 to about 30 kilodaltons, as determined by SDS-PAGE.

[0093] Also a part of the invention are expression vectors which include the nucleic acid molecules of the invention, operably linked to a promoter, so as to facilitate expression of the DNA. It is well within the skill of the artisan to prepare such vectors.

[0094] The vectors, as well as the nucleic acid molecules per se, can be used to prepare recombinant cells, be these eukaryotic or prokaryotic, wherein either an expression vector or the nucleic acid molecule itself is incorporated therein. E. coli cells, COS cells, CHO cells, etc., are all examples of types of cells which may be used in accordance with this aspect of the invention.

[0095] Proteins encoded by the above referenced nucleic acid molecules, preferably in isolated form, are another feature of this invention. By “protein” is meant both the immediate product of expression of the nucleic acid molecules and glycosylated forms of it, as well as multimeric forms, such as dimers, trimers, and so forth. Also a part of the invention are multimers, such as dimers, which contain at least one protein molecule of the invention, and at least one, different protein molecule. Preferably, this different protein molecule is a cytokine, such as IL-10. Also included as a feature of the invention are constructs, such as fusion proteins, where all or a part of the proteins described supra are linked in some fashion to at least one additional protein or peptide, or amino acid sequence. The “fusion partner” may be, for example, a molecule which provides a recognizable signal, either directly or indirectly, such as a FLAG peptide, β-galactosidase, luciferase, and so forth. These fusion partners are preferably joined to the molecule which is described supra at the N- and/or C-terminus of the protein; however, it is to be understood that there are many techniques known for joining molecules to amino acids, and any and all of these methodologies can produce constructs which are a part of the invention.

[0096] The individual protein molecules of the invention, as noted supra, will preferably have a molecular weight of from about 17 to about 30 kilodaltons, as determined by SDS-PAGE. In multimeric forms, the molecular weight of the complex will, of course, vary, but the TIF molecules contained therein will each have a molecular weight of about 17 to 30 kilodaltons, as determined by SDS-PAGE.

[0097] The proteins preferably consist of at least about 120 and no more than about 200 amino acids. Preferably, the amino acid sequence consists of or comprises all or part of the amino acid sequences encoded by SEQ ID NOS: 7, 8, 9, 24 or 25. More preferably, the amino acid sequence contains all but about the first 40 amino acids encoded by said SEQ ID NOS. Even more preferably, it contains all but about the first 20 amino acids encoded by these sequences. Most preferably, the protein comprises amino acids set forth at SEQ ID NO: 27 or 28.

[0098] It will be appreciated by the skilled artisan that the proteins encoded by the above recited nucleic acid molecules are a feature of the invention, and may be used to produce antibodies, in accordance with standard protocols. Such antibodies, in monoclonal and polyclonal form, constitute a further feature of the invention, as do fragments of said antibodies, chimeric forms, humanized forms, recombinant forms, and so forth. Also a feature of the invention are immunogens, comprising all or a part of the amino acid sequence of the protein molecules of the invention, preferably combined with an adjuvant, such as Complete or Incomplete Freund's Adjuvant. Portions of the protein sequences may be linked to other molecules, such as keyhole limpet hemocyanin, to render them more immunogenic. These antibodies can be used, e.g., to determine if the proteins of the invention are present. This is a further feature of the invention, as is now explained. It has been shown, in the examples, that the nucleic acid molecules of the invention were expressed in the presence of IL-9. Hence, a further feature of the invention is a method to determine if IL-9 is or has been present, wherein one detects either the proteins of the invention, using antibodies for example, or mRNA, using the nucleic acid molecules of the invention as probes. The mRNA can be determined directly, or in the form of cDNA. Such probes may or may not be labeled, as a matter of choice for the user. Hence, one can determine, for example, if, following administration of IL-9, the cytokine is still efficacious, by determining if the nucleic acid molecule of the invention is present. This type of assay can be adapted for quantitative studies, wherein one determines, for example, if a cell is sensitive to IL-9, and if so, how sensitive it is. One can also use the proteins of the invention to phosphorylate STAT proteins such as STAT1, STAT3 and/or STAT5. This in turn results in dimerization of the STAT protein, followed by migration to the nucleus to provoke the effect that these STAT proteins have on cells.

[0099] One could also use these molecules to test the efficacy of IL-9 agonists or antagonists when administered to a subject, such as a subject suffering from lymphoma, an immune system disorder such as an allergy, acquired immune deficiency syndrome, autoimmune diabetes, thyroiditis, or any of the other conditions described in, e.g., U.S. Pat. Nos. 5,830,454; 5,824,551, and pending application Ser. No. 08/925,348, filed on Sep. 8, 1997, now allowed, all of which are incorporated by reference. The molecules can also be used to mediate the role of IL-9 in these and other conditions. To elaborate, since IL-9 induces TIFs, the TIFs are useful as IL-9 activity mediators. Thus, a further aspect of the invention is a method to determine activity of endogenous IL-9, such as in situations where excess IL-9 activity is implicated, such as asthmas, allergies, and lymphomas. One can also block or inhibit IL-9 activity by blocking or inhibiting TIF or TIF activity, using, e.g., antisense molecules, antibodies which bind to TIF, or other antagonists of these molecules. For example, muteins of TIF, which bind to the TIF receptor but do not activate it, thereby inhibiting IL-9 induced activity, are a feature of the invention. Examples of conditions which can be treated by the use of such TIF muteins are allergies, asthma, and so forth. Muteins in accordance with the invention can be made in accordance with, e.g., Weigel, et al., Eur. J. Biochem. 180(2):295-300 (1989) and Epps, et al., Cytokine 9(3):149-156 (1997), both of which are incorporated by reference. Such muteins can be used in the treatment of asthma, allergies, or both. Further, it will be clear to the skilled artisan that the models set forth, supra, can also be used to screen for appropriate muteins. The ability to regulate IL-9 activity is important in conditions such as those listed supra, as well as conditions such as apoptosis, including cortisol induced apoptosis, conditions involving the nuclear expression of BCL-3, since IL-9 is known to induce such expression, and so forth. “Antibodies,” as used herein, refers to any portion of an antibody which binds to TIF, including chimeric and humanized antibodies.

[0100] Another feature of the invention relates to the ability of the TIF type molecules of the invention to either promote regeneration or inhibit differentiation of tissue types on which the molecules are active. As was shown, supra, the TIF molecules target various cancer and normal cell lines (i.e., mesangial and neuronal cells, as well as melanoma and hepatoma cells). Hence, one can stimulate regeneration of tissue via, e.g., adding an amount of a TIF type molecule to a sample in need of regeneration of a tissue acted on by the TIF molecule. This approach can be used both in vitro, and in vivo. Similarly, antagonists of TIF may be added when the situation is one where the aim is to inhibit differentiation of a particular type of tissue, such as melanoma or hepatoma.

[0101] The genes which encode TIF, as noted in Example 25, supra, are located on chromosome 12. This chromosome is associated with asthma, as is known in the art. Hence, a further embodiment of the invention is a method for determining susceptibility to conditions such as, or related to, asthma, by determining if aberrations, such as polymorphisms, deletions, additions, etc., are present at the site of the TIF gene. Such aberrations may be indicia of susceptibility to, or of the presence of, asthma, an allergic condition, or one or more related conditions. The ability to detect aberrations in a DNA sequence is well known in the art, and such methods need not be set forth herein. Preferably, the aberration or aberrations are detected via standard techniques, such as PCR, using the methodologies and primers referred to supra.

[0102] Other features of the invention will be clear to the artisan and need not be discussed further.

[0103] The terms and expressions which have been employed are used as terms of description and not of limitation, and there is no intention in the use of such terms and expressions of excluding any equivalents of the features shown and described or portions thereof, it being recognized that various modifications are possible within the scope of the invention.

1 29 1 24 DNA Mus musculus 1 agcactctcc agcctctcac cgca 24 2 12 DNA Mus musculus 2 gatctgcggt ga 12 3 24 DNA Mus musculus 3 accgacgtcg actatccatg aaca 24 4 12 DNA Mus musculus 4 gatctgttca tg 12 5 24 DNA Mus musculus 5 aggcaactgt gctatccgag ggaa 24 6 12 DNA Mus musculus 6 gatcttccct cg 12 7 1119 DNA Mus musculus 7 taaacaggct ctcctctcac ttatcaactg ttgacacttg tgcgatctct gatggctgtc 60 ctgcagaaat ctatgagttt ttcccttatg gggactttgg ccgccagctg cctgcttctc 120 attgccctgt gggcccagga ggcaaatgcg ctgcccgtca acacccggtg caagcttgag 180 gtgtccaact tccagcagcc gtacatcgtc aaccgcacct ttatgctggc caaggaggcc 240 agccttgcag ataacaacac agacgtccgg ctcatcgggg agaaactgtt ccgaggagtc 300 agtgctaaag atcagtgcta cctgatgaag caggtgctca acttcaccct ggaagacgtt 360 ctgctccccc agtcagacag gttccagccc tacatgcagg aggtggtacc tttcctgacc 420 aaactcagca atcagctcag ctcctgtcac atcagcggtg acgaccagaa catccagaag 480 aatgtcagaa ggctgaagga gacagtgaaa aagcttggag agagtggaga gatcaaggcg 540 attggggaac tggacctgct gtttatgtct ctgagaaatg cttgcgtctg agcgagaaga 600 agctagaaaa cgaagaactg ctccttcctg ccttctaaaa agaacaataa gatccctgaa 660 tggacttttt tactaaagga aagtgagaag ctaacgtcca tcatcattag aagatttcac 720 atgaaacctg gctcagttga aaaagaaaat agtgtcaagt tgtccatgag accagaggta 780 gacttgataa ccacaaagat tcattgacaa tattttattg tcactgatga tacaacagaa 840 aaataatgta ctttaaaaaa ttgtttgaaa ggaggttacc tctcattcct ttagaaaaaa 900 agcttatgta acttcatttc catatccaat attttatata tgtaagttta tttattataa 960 gtatacattt tatttatgtc agtttattaa tatggattta tttatagaaa cattatctgc 1020 tattgatatt tagtataagg caaataatat ttatgacaat aactatggaa acaagatatc 1080 ttaggcttta ataaacacat ggatatcata aaaaaaaaa 1119 8 7445 DNA Mus musculus 8 gtctatcacc tgcttaagat tcttctaatt tataaaaaaa actatttctt aaaatgaaaa 60 gcaaccagag cacgtattta tagcatggtg ttctgaccat gcaggtacag agtggaatgg 120 taagaggcgc tattatcagc attaaccaac atgttaatgt tttcttctgg caagcaaact 180 tgaaatctat gtcttaaaca atcttcaagc ctctaatata gtgctaacga ctggagtccg 240 ctgctgtcca acagagctct tgagcacgct ctcctctgtt tgcaatttta tgttctttga 300 
tcgactcccc aacctctcac cttcggctcc tgatggccac ctttcaactt tctgcattta 360 tgaactccat gttttaatct ttttattaaa atattcacac aatcagtgtt tgtgcaagtc 420 tgtttcaccc acatgtatgt ctgtgcacca agtgctgcct ggtgcttgtg ggggcaagga 480 gcaggagagg gtgccctggc accggagtca cggatggttg tgagccacca tgaggatgct 540 gggagttaga cccaggtcct ccagaagtgc agcaaatgct cttaaccaca cgcaggcatt 600 tctctctcca gccccaacat gagtgctttt agattccacc tagaatagag atctgatggc 660 ttcactcact gccacctccc ctttgcatct ttctgccaag gaacaccaaa aagcaagaat 720 ccccacactg ctttcgctcc tcaagtctgc acctctcaac aggtcaagat tctccagtgt 780 ccctctaaca ctttccccag tgtccctcta acactttctc cagtgtccct ctaacacttt 840 ctccagtgtc cctctaacac ttttgatctc aattagctga ggggagaaag atctcacaca 900 gtgattttca tgacttcgcg ttctagtcta gatgtaggca tttgcgtgtc agtctagggt 960 aggcgtctgc tcccgctgct taggaaagac tttcctagtc tagttgtcag gtgctatctg 1020 ggattcagtg tacatacaat gcaaaaaatc ccagtatttt gtaaattctc ttcttcaact 1080 atccatctat atagtatgtt attgtaggct catttaaaaa taatattttg agacttatgc 1140 ttgcacaagt aaaatgtcag agaattagca aatgtatagt attattttat tttaaaaaaa 1200 tctatgctta aaatgtctat tagattgttc actaccgata tttccaaact taacttgacc 1260 ttggctatga tttcaacctt tgtatttgca tctaccataa cagtctctga accagaacat 1320 tctgtggcaa tgggagctgt gaagaaagcc aacattctta ttaaaaaaaa aaaacagcta 1380 gttatagttt aggattccat atactaaaaa aaatagagat ataattattt taaaaattga 1440 aataatctcc aagttttcat tatggcttat ttcaaagcac agaatatagg acacgggtct 1500 tttatttctg gtcacttcta aagagataag aatctatgaa gttggtggga aaatgagtcc 1560 gtgaccaaaa cgctgactca atagctacgg gagatcaaag gctgctctac tcaatcagaa 1620 tctactacgg caaagccatg gctttctttg aaaaccgtgt ttagaagatt tctgggattt 1680 gtgtgcaaaa gcaccttgtt ggccctcacc gtgacgtttt agggaagact tcccatctct 1740 caaggtggga aggcttggag gtggtgtctt gtggcctcct atggtggtta ggtacttctc 1800 agaagacagg actggaaatt agataatgtc tgatgtcata tcattcacaa taccaaaaaa 1860 accctggtgt cccgatggct ataaaagcag caacttctgc ctctcccatc acaagcagag 1920 acacctaaac aggtaagcac tcagacctct acagacaatc atctgcttgg taccatgcta 1980 cccgacgaac atgctcccct 
gatgtttttg ccttttgctc tctcactaac aggctctcct 2040 ctcacttatc aactgttgac acttgtgcga tctctgatgg ctgtcctgca gaaatctatg 2100 agtttttccc ttatggggac tttggccgcc agctgcctgc ttctcattgc cctgtgggcc 2160 caggaggcaa atgcgctgcc cgtcaacacc cggtgcaagc ttgaggtgtc caacttccag 2220 cagccgtaca tcgtcaaccg cacctttatg ctggccaagg aggtacagct gcatctcttt 2280 ctctccatac cgccttgcca ttttctctga agcacttgca aactctttag gggcgcttta 2340 tctccgcagg tctcactacc tatgttttct gtctctttag agactcttta aggactgggt 2400 ctttttctat ttctatttca aggtctcagg accatttcct atcttggcct tcaggacaca 2460 tatactgaat tttatctaca gaggcgcatt tagaaagcca cccacgactg caatactttc 2520 catttctctg tgctctcttc tgaactcata ctctcttggc tactcctgag acccactgcg 2580 gacatacatc tctacttaca ggcttttctt ccatctcctt gtcacccagg cacttagggt 2640 tttctctctt tcaggccagc cttgcagata acaacacaga cgtccggctc atcggggaga 2700 aactgttccg aggagtcagt gtaagtcctc actgtgatga gcagggctag ctgcgggagc 2760 tggtggaccc tctgggatag tctgacgtat gacccctgct gcttcttgtc tacctgcagg 2820 ctaaagatca gtgctacctg atgaagcagg tgctcaactt caccctggaa gacgttctgc 2880 tcccccagtc agacaggttc cagccctaca tgcaggaggt ggtacctttc ctgaccaaac 2940 tcagcaatca gctcagctcc tgtgtaagtc tgactctggc tacctatgct cctctctctt 3000 cctcttctat tccagtaaga acccgaggtc ctgccctctc tctcttcaca agagtgagga 3060 gggcctcagc accaccacca tcataggcca cttgaaatag gtcacaaagg ctttggcttc 3120 aattgagtaa tactttgagt ttgtatgagt gaagctttat ttgttttatc catggaaaga 3180 aatcaactca aattctgtag gatgagaaag atgttgggaa cgaaaaaagg cctagataga 3240 gaaacagatc tgctgagtat agtacttatg gggggagcag ggggcgatat ccactgagta 3300 caagtacttg tggggagaga aatccactga gtacaagtac ttgttggcat ggagatccac 3360 tgagtacaag tacttgtggg gggagggaat ggcacagagc aaaagttgaa gggaaggaag 3420 atggagaggc ctcatggttg ggggtgtgaa aggtcactcc ttttccatgt gatggagagt 3480 taagaaaaac cagtgtgtga gtttgatgtc ttcagacacc cccaactatg aaacatatcc 3540 acgaggagcg ggcagactgt gggagacctg gcatttaggg aaggcgcggc ttttcacacg 3600 agaaacttta tgctcatctc ttgtgctaca ctcccacctt tgatgaggtt cagctcaggt 3660 ttcgtttcta ccgttcttgc tactggtgga 
aacttcagta ggattcccca aagacgagga 3720 cagctcttct gtaagggagg gacctggatt tcagtgtcct agagaacgaa atagctcaga 3780 gaatctaggt caacgtgaaa tctaggtcac agcgggcaaa aatgactgaa cgcctctatt 3840 ccaggtgaac ggtcacgtgc ctcagatata ctgaggtatt gggctcccac cggataagat 3900 tctgttagtg agtctgcttt tattttgcag cacatcagcg gtgacgacca gaacatccag 3960 aagaatgtca gaaggctgaa ggagacagtg aaaaaggtac tattggcaag ccacaatact 4020 aagccattca gtaggagacg tggggatttc tttctctgct tcccagtccc ttctactttg 4080 taacatttta tttgacttgt ctactatctg gtccattact cgcttagctg cacctgtatc 4140 tagctgggtc tatagatctt tcaatctgtg tctaaatttg taagtcacaa ttctggagct 4200 agcagaaagc ttagctcagc cagtctcatg agcacttgct cggaggatgg cttgtgacag 4260 agtcaatgct agaagacagc atccctgatt cccagctctg cacttgccta gtggccatgt 4320 gtaattactt tggcttgatt aagtatttgg gaaagccagt tcccacggac ctacataatc 4380 tgaagaacca tgcattgaaa actagaaagc tgggcacaaa cttactagag atgatttttg 4440 agctcattaa acggatgctc tgaaatgtgg caaaatcaac ccagaataac aacaaaagag 4500 ctggatttgc aaataggaca agtatttaga atcactggta ttaatagcta tcatcttaat 4560 taaaatatag ggcctatata tatatttaag attaaacaca agagtggata gcctcccaat 4620 ttacttggcc tggtttcaaa agagtaaaaa tatcagtcat ggattaatta tagtgtcatg 4680 aaagtatgag atggaaaccc tttccttact ttttaccttc atttcttagt tttttttttc 4740 ttcacaccct gatcaagcca ctagtaagca cctatctgct gtgagctatt atatgacttt 4800 acagcaaaca acattgctgt gtggcctctt tggggaaggg aacaggatag caggaggctc 4860 aggctagcaa gtctgacttg ccctaaagcc agaggcatgg ttgatagcag agaaagtgag 4920 gctcttcgca agtgggtgtg cttaagtaat cagaaacagg aaggctccgg ttgatggaat 4980 tatcagtaag atatctaccc ttatctcctt ctatcgaacc taaatcgtct ctttttcttg 5040 tgtgtaggct gataaacaca cttgttttct tttgagtgtt catggctttg tagattttta 5100 gtgctctgcc agttcttgtt agagggtttg ttaccttgac acctgggctt ggatgttagc 5160 atgccaaagg cacacacttc tgaatgcctg tgtaaaaggt tattattcat ttactttgtc 5220 tttggaaagg tgaagcgtgt gtgagaaaga actcacagga gatgtgttct ctgtaggaaa 5280 actttttttt tccccttaaa tgcctataat ccactttcag tcaactttga cttttatacc 5340 atgctgtcac atgaaagagt gtttaggccc gctctcatgg 
ctctgggaaa agcaccaata 5400 ggggaaggaa tgttatgctg agaaatctga ccggcaggga aactggtcag agctcccccg 5460 aagaccacca caggtgttaa gtaggaacag tccagggtgg gctcatgtaa tagaatggaa 5520 cagagcgagg gaagataagc tacaaagttt catagggtcc ggagtcttaa agatacaaaa 5580 tagctgcttg ggcttcataa caaaggaagt ctgggaaggc agcaagtgag agggaaatgg 5640 aaagggaaaa aacagaatgt agaggacttg aacagctaca aatcctctac cagacgattt 5700 ttcttggaac aatctagaag gtagtggatt aggtgattgc agggggactt gctttgccat 5760 ttgaatctgg gtttttgtct ctccattgag gttgaaagcg tcaccctttt taccctcgaa 5820 tggaggagga aagaaggggt gttatgactc ctacctggag ttttactagt ttacgcaatg 5880 gaacagacac tcgggacctc ctcttgacaa aaaaaatgga aacctgttgt ttgtcttgtt 5940 tgttcttttg ttaagaaagc acaggcaaag cccgaccaca tgggttgaat gtgggtcttt 6000 gagtcaaggc ttttgagttg agcactcatc aatagttgat catggtcagg tggagggcta 6060 cctgtcaggc cgagccctgc tggcttcgca cttaacatct ccaggtctca gtatcacttc 6120 ctgctactta gcacagttag gagttgagca aacctttttt tccaaccccc actaaaattt 6180 aattgacaaa agactgtgta atttgtggga tacagtgtga taattgatct atgtgtgcat 6240 tgtgcaaggt tcaataagat agattaatag gcccatcaac agctttatgg gtgtgaaatg 6300 caagtaatat aggtagatgc ctgtggtgtc cttaggtcag aaaggcatga ttttaaggtc 6360 ttgggcaaat catattatac tcatgctaaa aatacattat gttgattatt aatcttttag 6420 agaaggctga tacttggttt tggtgctcag caagcaaatg tcaccagctc tttctaactg 6480 gtaccacttt agaaaatgct acctgtgctc aaattggttt gtattcttat tttcatagct 6540 tggagagagt ggagagatca aggcgattgg ggaactggac ctgctgttta tgtctctgag 6600 aaatgcttgc gtctgagcga gaagaagcta gaaaacgaag aactgctcct tcctgccttc 6660 taaaaagaac aataagatcc ctgaatggac ttttttacta aaggaaagtg agaagctaac 6720 gtccatcatc attagaagat ttcacatgaa acctggctca gttgaaaaag aaaatagtgt 6780 caagttgtcc atgagaccag aggtagactt gataaccaca aagattcatt gacaatattt 6840 tattgtcact gatgatacaa cagaaaaata atgtacttta aaaaattgtt tgaaaggagg 6900 ttacctctca ttcctttaga aaaaaagctt atgtaacttc atttccatat ccaatatttt 6960 atatatgtaa gtttatttat tataagtata cattttattt atgtcagttt attaatatgg 7020 atttatttat agaaacatta tctgctattg atatttagta taaggcaaat 
aatatttatg 7080 acaataacta tggaaacaag atatcttagg ctttaataaa cacatggata tcataaatct 7140 tctgtcttgt aatttttctc cctttaatat caacaatacc atcatcatca tcattaccca 7200 atcattctca tgatttcatg cttgacccat attatactgt taaagttggt tcctggaggc 7260 ctgtggtttt gtgtgtgttg tgtgtgtgtg tggggttatg catgtgaaag ccagagatgg 7320 atattaggtg ttcttctcta tcagtctttg ccttattatt tgagacaggg tctgtcactg 7380 aacctgtagc taggctggcc aacaagctct attaattttt tttaagatta attaattatg 7440 tgtat 7445 9 1111 DNA Mus musculus 9 aacaggctct cctctcagtt atcaactttt gacacttgtg cgatcggtga tggctgtcct 60 gcagaaatct atgagttttt cccttatggg gactttggcc gccagctgcc tgcttctcat 120 tgccctgtgg gcccaggagg caaatgcgct gcccatcaac acccggtgca agcttgaggt 180 gtccaacttc cagcagccgt acatcgtcaa ccgcaccttt atgctggcca aggaggccag 240 ccttgcagat aacaacacag acgtccggct catcggggag aaactgttcc gaggagtcag 300 tgctaaggat cagtgctacc tgatgaagca ggtgctcaac ttcaccctgg aagacattct 360 gctcccccag tcagacaggt tccggcccta catgcaggag gtggtgcctt tcctgaccaa 420 actcagcaat cagctcagct cctgtcacat cagtggtgac gaccagaaca tccagaagaa 480 tgtcagaagg ctgaaggaga cagtgaaaaa gcttggagag agcggagaga tcaaagcgat 540 cggggaactg gacctgctgt ttatgtctct gagaaatgct tgcgtctgag cgagaagaag 600 ctagaaaacg aagaactgct ccttcctgcc ttctaaaaag aacaataaga tccctgaatg 660 gactttttta ctaaaggaaa gtgagaagct aacgtccacc atcattagaa gatttcacat 720 gaaacctggc tcagttgaaa gagaaaatag tgtcaagttg tccatgagac cagaggtaga 780 cttgataacc acaaagattc attgacaata ttttattgtc attgataatg caacagaaaa 840 agtatgtact ttaaaaaatt gtttgaaagg aggttacctc tcattcctct agaagaaaag 900 cctatgtaac ttcatttcca taaccaatac tttatatatg taagtttatt tattataagt 960 atacatttta tttatgtcag tttattaata tggatttatt tatagaaaaa ttatctgatg 1020 ttgatatttg agtataaagc aaataatatt tatgataata actatagaaa caagatatct 1080 taggctttaa taaacacatg aatatcataa a 1111 10 21 DNA Mus musculus 10 ctgcctgctt ctcattgccc t 21 11 21 DNA Mus musculus 11 caagtctacc tctggtctca t 21 12 20 DNA Mus musculus 12 gacgcaagca tttctcagag 20 13 16 DNA Homo sapiens 13 atgtatttcc cagaaa 16 14 17 DNA Homo 
sapiens 14 ccttttctgg gaaatac 17 15 22 DNA Homo sapiens 15 agctgctcaa cttcaccctg ga 22 16 22 DNA Homo sapiens 16 ccactctctc caagcttttt ca 22 17 21 DNA Homo sapiens 17 caagtctacc tctggtctca t 21 18 21 DNA Homo sapiens 18 tggccaggaa gggcaccacc t 21 19 21 DNA Homo sapiens 19 tggccaggaa gggcaccacc t 21 20 36 DNA Homo sapiens 24,25,29, 30,34,35 n is inosine 20 ggccacgcgt cgactagtac gggnngggnn gggnng 36 21 20 DNA Homo sapiens 21 ggccacgcgt cgactagtac 20 22 20 DNA Homo sapiens 22 ccttccccag tcaccagttg 20 23 20 DNA Homo sapiens 23 taattgttat tcttagcagg 20 24 690 DNA Homo sapiens 24 tgcacaagca gaatcttcag aacaggttct ccttccccag tcaccagttg ctcgagttag 60 aattgtctgc aatggccgcc ctgcagaaat ctgtgagctc tttccttatg gggaccctgg 120 ccaccagctg cctccttctc ttggccctct tggtacaggg aggagcagct gcgcccatca 180 gctcccactg caggcttgac aagtccaact tccagcagcc ctatatcacc aaccgcacct 240 tcatgctggc taaggaggct agcttggctg ataacaacac agacgttcgt ctcattgggg 300 agaaactgtt ccacggagtc agtatgagtg agcgctgcta tctgatgaag caggtgctga 360 acttcaccct tgaagaagtg ctgttccctc aatctgatag gttccagcct tatatgcagg 420 aggtggtgcc cttcctggcc aggctcagca acaggctaag cacatgtcat attgaaggtg 480 atgacctgca tatccagagg aatgtgcaaa agctgaagga cacagtgaaa aagcttggag 540 agagtggaga gatcaaagca attggagaac tggatttgct gtttatgtct ctgagaaatg 600 cctgcatttg accagagcaa agctgaaaaa tgaataacta accccctttc cctgctagaa 660 ataacaatta gatgccccaa agcgattttt 690 25 4797 DNA Homo sapiens 25 tgcacaagca gaatcttcag aacaggttct ccttccccag tcaccagttg ctcgagttag 60 aattgtctgc aatggccgcc ctgcagaaat ctgtgagctc tttccttatg gggaccctgg 120 ccaccagctg cctccttctc ttggccctct tggtacaggg aggagcagct gcgcccatca 180 gctcccactg caggcttgac aagtccaact tccagcagcc ctatatcacc aaccgcacct 240 tcatgctggc taaggaggta tacatctcaa tcctgctctt tctcgttgga tctacttgga 300 atccaaatag ttcttaaact tttcttcaga gcatctctaa gagctttagg aacccactgt 360 ttatccctga gggtagataa attttctgtt ttttcagaga ctctttggga atctggcttt 420 ttttttttct tgaacttctt ccttccattt tggcctttat gatacatatg atgaattttt 480 cccaaagagc ggccattcag 
taatccatct gatgattttt ttttccttta tgcctctgtg 540 cattgttcta aactcatgca cacatctgaa ttctgctttt agtctttatg atgttgctct 600 ggggagacgg gatggggcac atgtctatgt ataaattttt tttctatttg ctcaatgtcc 660 agacccttag tcttttcttc tcttccaggc tagcttggct gataacaaca cagacgttcg 720 tctcattggg gagaaactgt tccacggagt cagtgtaagc tacagttgtg acgaacaggg 780 ccgtgtgccg tccatgggta cttggggtgg tggtgatgat ggtttaggtc ttatccctta 840 tgaccctttc tgtttccctt ccacctgcag atgagtgagc gctgctatct gatgaagcag 900 gtgctgaact tcacccttga agaagtgctg ttccctcaat ctgataggtt ccagccttat 960 atgcaggagg tggtgccctt cctggccagg ctcagcaaca ggctaagcac atgtgtaagt 1020 tcagctctca gcctatgccc acctacccct ccttccctcc ttccacagag acccccttac 1080 cccaactctc tctccttccc cctaccccta agctagcagg aagaagtgtc ttggcagcag 1140 tgttatcagg agtcatttgg gatcatagag tatttgcttt tgctttgact gagtcacatc 1200 ttgagtttat agtggtgaat ggggtctgga acttaagtgt acagaagccg cattggtttg 1260 tcttcggaaa aaaggcaact caggttgcgt aagatgagaa aggtgttggg aaaacatcta 1320 gctgtggaaa tggatccatt gagtctaagt tgttgagggg aggggatggc atggagagaa 1380 attagaagag aaagtgggaa atgggaaggc ttaaagtcgg tggtgggtcg gcagactgtt 1440 gccctgttga tgtcatggga agccacaaaa tcggaggcgt gtgaacttga tgccgctgaa 1500 catttgaaac tatgaaaaaa agtttgagtg gagtgggccc agtaaaaggc cctaggactt 1560 actgaagagg gcttaatttt cacatgagat gttttatgta catttcttgt tctaagcatg 1620 caattttctg gagatacgat tgaggtttta ttccttacag aatttgcata aactactccg 1680 ctctttccac aaatgcaaac ctcagtagga tttcccaaag atgaagagag gtctcttgta 1740 agggaagtga ctggattctg gcgtccaagg gaattcaaga gctcaggaaa tctaggtcac 1800 tgttgaaatc taggtcattg tgggcaaaat tactaagagc tttaattcca ggtgaattgt 1860 actgtacctc catgggtgtg gaggttcata aagtttcagc acaacattaa gatagttatg 1920 cttgttattg ttttatagca tattgaaggt gatgacctgc atatccagag gaatgtgcaa 1980 aagctgaagg acacagtgaa aaaggtagga ctgataactg tcaatgctaa gtcatgcaat 2040 aggagagaca aatgttgttt ttctttcctt tctttcttcc catcactttg tgatttttca 2100 cttgattctc ctaccaccag ggcgattact ttggtgtctg tgtatgtaga tatatctata 2160 tatctagatg tcagtttcca aatcttgcaa 
attgtagaat tctagaactg gttgggatct 2220 tagcttgtct agtcacataa cctcagattc tggggatggt cagtggcaga gatagggcta 2280 gaatgcaggt ctcctgaatc ccaagccagc acttttcccg gtggtgatac agattagttt 2340 tggtaccatt aattcttagg gaaatttcag attcctattg actcatgtaa tctgaagaag 2400 tacttgttta aaaacagaaa aatgcctatg ggcaaattta tttgaagtca tttttgaagt 2460 cattaatgca ttgctttgaa acttggaaga ataaactcag aacaatgaga aaagagctgg 2520 acttgcatat agggctaatt tctggagtaa taaacactta ttttgaatta tcataatatc 2580 tatcagatat tgattatagt ttaaaagcaa gagcagacaa ccccgatctc ttttatacag 2640 gttcaaatag agtaaaaata ttagtaagag atttattata gttaaatgga agtctgaatt 2700 ggtaagcttt tttttcttcc tctctcccat caagaccttc cattctagtt tcttccttca 2760 ctccctcaac aaatccctag ggagcattta tccatggtgg gctggtgtac atttctatag 2820 tgaatgatac catcatgtgg cctatttggt gaaaagaaca acaatggaag gcttagacta 2880 acaatagtga ctcaccccaa aaccggagga atgattagga gcagtgaaag tgacgctctt 2940 gcaagcaggt acaactaaat actcagaaac atgaaggctc cagttgatgg aattttcagt 3000 aacaagctta accttaattc cccctttttc cctcttgact ttttaaaaaa gcgtttcttc 3060 ctgagcatca tttaatgagt gtgactgttt cttcctttga taattgaagg ctttgtagtt 3120 ttaaattgtg aagcccagtt ctcttgttat agaactatta tctagacatg gagggctgaa 3180 tgttagcatg ccacagacaa ggcatgcttt acacatcttg cttaaaaaat tactgatttc 3240 atcttgcttg ttgtctttag aaaagtgaag tgtgagagag gagaatctca tggtgatctg 3300 tgtgattttc aagaccttta atccattttg aaagaatcaa tttcatattt gcaatgggtt 3360 gccatgtgga agagtgatta tgcttttttg ctggtagctt cagaaagcac aggagggaga 3420 gcaatgttgt tcagagaaag atcaacagga ggagaaactg tcagagctgt ctgaaatagg 3480 gtggttttgg gaggcattaa ttccctctcg ttgggggtaa aagcagaacg caggttggta 3540 gtaaaatgca tgacagacag taggggacga taaactttaa aattctttat agtcttggag 3600 tctttgagat agaaaagaat atctttttgg ccttatgtca aaagaagtat ggaaaggtga 3660 aagggcggaa gaaagcagga aaaggaagaa ccatgtatta tatagaggac aatggtgaca 3720 aggtttttct tgaaataatg caaatatgat agattagagg aatttcagta gggaatgctt 3780 ttcacttgaa tttgggtttc ctcttcgatt aagtttggga tcctcatctg catttgactt 3840 ggagagagaa agaatgaatg ttaggaccta tatctggttt 
tctattaact aaagcaagtg 3900 gaaaagactt atttggtatt tttcccacaa aagtgaaaac ttttctttta ctgtttgtca 3960 aaaaggtgga aatagaaaaa gccttaatgt attggtgaat acatggttca aagtcatttg 4020 agtagagatg ttttaaatca ggagtgtcca atcatttggc ttccctggac caccttgaaa 4080 gaattgtctt ggtacacaca taaaatacaa gaacaatagc tgatgagcta aaaaagtcca 4140 tgcataaatc tcatactgtt ttaagaaagt ttatgaattt ctgttagggt gcattcaaag 4200 ctgtcctggg ccatgtgcgg cctgtgggct gcaggttgga caagctcctt ataagtaatc 4260 tgtcatagat agttttggag ctgcaaaaca ggccaaggca taatgggtgg cactcgggat 4320 cccccagatc ccagcctcac ttcagtctcc ttgctctggt taagaagggg tggtcaactc 4380 tctgcccagc ttttaaacag cttcattagt gtgaggtgca cctgaaattg atgcctgctg 4440 gtggcctctc agtccagaga gccgtcattt taagctcttt ggcaaatcat acaatactaa 4500 agggatatta ctatgaatgt tttacaaatg cttaaaactc ggtttctgtc tccatcaacc 4560 taatcttgca atttctaatt tgttcacttt agaaaacatg gcataaatgc tcaaatactt 4620 ttgcattctt attttcacag cttggagaga gtggagagat caaagcaatt ggagaactgg 4680 atttgctgtt tatgtctctg agaaatgcct gcatttgacc agagcaaagc tgaaaaatga 4740 ataactaacc ccctttccct gctagaaata acaattagat gccccaaagc gattttt 4797 26 20 DNA Homo sapiens 26 atcagatgga ttactgaatg 20 27 179 PRT Mus musculus 27 Met Ala Val Leu Gln Lys Ser Met Ser Phe Ser Leu Met Gly Thr Leu 1 5 10 15 Ala Ala Ser Cys Leu Leu Leu Ile Ala Leu Trp Ala Gln Glu Ala Asn 20 25 30 Ala Leu Pro Val Asn Thr Arg Cys Lys Leu Glu Val Ser Asn Phe Gln 35 40 45 Gln Pro Tyr Ile Val Asn Arg Thr Phe Met Leu Ala Lys Glu Ala Ser 50 55 60 Leu Ala Asp Asn Asn Thr Asp Val Arg Leu Ile Gly Glu Lys Leu Phe 65 70 75 80 Arg Gly Val Ser Ala Lys Asp Gln Cys Tyr Leu Met Lys Gln Val Leu 85 90 95 Asn Phe Thr Leu Glu Asp Val Leu Leu Pro Gln Ser Asp Arg Phe Gln 100 105 110 Pro Tyr Met Gln Glu Val Val Pro Phe Leu Thr Lys Leu Ser Asn Gln 115 120 125 Leu Ser Ser Cys His Ile Ser Gly Asp Asp Gln Asn Ile Gln Lys Asn 130 135 140 Val Arg Arg Leu Lys Glu Thr Val Lys Lys Leu Gly Glu Ser Gly Glu 145 150 155 160 Ile Lys Ala Ile Gly Glu Leu Asp Leu Leu Phe Met Ser Leu Arg Asn 165 170 175 Ala 
Cys Val 28 179 PRT Homo sapiens 28 Met Ala Ala Leu Gln Lys Ser Val Ser Ser Phe Leu Met Gly Thr Leu 1 5 10 15 Ala Thr Ser Cys Leu Leu Leu Leu Ala Leu Leu Val Gln Glu Gly Ala 20 25 30 Ala Ala Pro Ile Ser Ser His Cys Arg Leu Asp Lys Ser Asn Phe Gln 35 40 45 Gln Pro Tyr Ile Thr Asn Arg Thr Phe Met Leu Ala Lys Glu Ala Ser 50 55 60 Leu Ala Asp Asn Asn Thr Asp Val Arg Leu Ile Gly Glu Lys Leu Phe 65 70 75 80 His Gly Val Ser Met Ser Glu Arg Cys Tyr Leu Met Lys Gln Val Leu 85 90 95 Asn Phe Thr Leu Glu Glu Ile Leu Phe Pro Gln Ser Asp Arg Phe Arg 100 105 110 Pro Tyr Met Gln Glu Val Val Pro Phe Leu Ala Arg Leu Ser Asn Arg 115 120 125 Leu Ser Thr Cys His Ile Glu Gly Asp Asp Leu His Ile Gln Arg Asn 130 135 140 Val Gln Lys Leu Lys Cys Thr Val Lys Lys Leu Gly Glu Ser Gly Glu 145 150 155 160 Ile Lys Ala Ile Gly Glu Leu Asp Leu Leu Phe Met Ser Leu Arg Asn 165 170 175 Ala Cys Ile 29 5935 DNA Homo sapiens 29 gaattcaagt ccacatgcaa tcaatccgaa tactttgtaa attctcttct tcaaatatcc 60 atctatatag tataagttat tgtaggatca tttaaaaata atgttttgag acttatgttt 120 gcacaagtaa aatgtcagag agaattagca aatgtatagt attattttat tttaaaaaat 180 ctatgcttaa aatgtctatt agattgttca ctactgacat ttccaaactt aacttgacct 240 tggctatgat ttcaaccttt gtatttgcat ctaccataac tgtgtgctca cttaccatgc 300 tatccgacga gcatgttccc ctgatgtttt tgccttttgc tctctcgcta acaggctctc 360 ctctcagtta tcaacttttg acacttgtgc gatcggtgat ggctgtcctg cagaaatcta 420 tgagtttttc ccttatgggg actttggccg ccagctgcct gcttctcatt gccctgtggg 480 cccaggaggc aaatgcgctg cccatcaaca cccggtgcaa gcttgaggtg tccaacttcc 540 agcagccgta catcgtcaac cgcaccttta tgctggccaa ggaggtacag ctgcatctct 600 ttctctccat accgccttgc catttctctg aagcacttgc aaactcttta ggggcgcttt 660 atctccgcag gtctcactac ctatgttttc tgtctcttta gagactcttt aaggactgga 720 tctttttcta tttctatttc aaggtctcag gaccatttcc tatcttggcc ttcaggacac 780 atatactgaa ttttatctac agaggcgcgt ttagaaagcc acccacgact gcaatacttt 840 ccatcctgtt gtgctctctt ctgaactcat actctcttgg ctactcctga gacccactgc 900 ggacatacat ctctacttac aggcttttct tccatctcct 
tgtcacccag gcacttaggg 960 ttttctctct ttcaggccag ccttgcagat aacaacacag acgtccggct catcggggag 1020 aaactgttcc gaggagtcag tgtaagtcct cactgtgatg agcagggcta gctgcgggag 1080 ctggtggacc ctctgggata gtctgacgta tgacccctgc tgcttcttgt ctacctgcag 1140 gctaaggatc agtgctacct gatgaagcag gtgctcaact tcaccctgga agacattctg 1200 ctcccccagt cagacaggtt ccggccctac atgcaggagg tggtgccttt cctgaccaaa 1260 ctcagcaatc agctcagctc ctgtgtaagt ctggctctgg ctacctatgc tcctctctct 1320 tcctcttcta ttccagtaag aacccgaggt cctgccctct ctctcttcac aagagtgagg 1380 agggcctcag caccaccacc atcataggcc acttgaaata ggtcacaaag gctttggctt 1440 caattgagta atactttgag tttgtattag ttaagcttta tttgttttat ccatggaaag 1500 aaatcaactc aaattctgta ggatgagaaa gatgttggga acgaaaaaag gcctagatag 1560 agaaacagat ctgctgagta cagtacttat gggggggggg ggcagggggc gatatccact 1620 gagtccaagt acttgttggg agagaaatcc actgagtaca agtacttgtg ggggaaggaa 1680 tggcacagag caaaagttga agggaaagag gaagatggag aggcctcaat gttgggggtg 1740 tgaaaggtca ctcctttttc catgtgatgg agagttaaga aaaatcagtg tgtgagtttg 1800 atgtcttcag acaccccaac tatggcagac tgtgggagac ctggcattta gggaaggcgc 1860 ggcttttcac acgagaaact ttatgctcat ctcttgtgct acactcccac ctttgatgag 1920 gttaagctca ggtttcgttt ctaccgttct tgctactggt ggaaacttca gtaggattcc 1980 ccaaagacga ggacagctct tctgtaaggg agggacctgg atttcagtgt cctagagaac 2040 gaaatagctc agagaatcta ggtcaacgtg aaatctaggt cacagcgggc aaaaatgact 2100 gaacgcctct attccaggtg aacggtcacg tgcctcagat atactgaggt attgggctcc 2160 caccggataa gattctgtta gtgagtctgc ttttattttg cagcacatca gtggtgacga 2220 ccagaacatc cagaagaatg tcagaaggct gaaggagaca gtgaaaaagg tactattggc 2280 aagccacaat actaagccat tcagtaggag acgtggggat ttctttctct gcttcccagt 2340 ctcttctact ttgtaacatt ttctttgact tgtctactgt ctggtccatt actcacttag 2400 ctgcacctgc atctagctgg gtctatagat ctttcaatct gtgtctaaat ttgtaagtca 2460 caattctgga gctagcagaa agcttagctc agccagtctc atgagcactt gctcggagga 2520 tggcttgtga cagagtcaat gctagaagac agcatccctg attcccagct ctgcacttgc 2580 ctagtggcca cgtgtaatta ctttagcctg attaagtatt tgggaaagcc 
aattcccacc 2640 gacctacata atccgaagaa gcatgcattg aaaactagaa agctgggcac aaacttacta 2700 gagatgattt ttgagctcat taaactgatg ctctgaaatg tgatcaaatc aacccagaat 2760 aacaacaaaa gagctggatt tgcaaatagg acaagtattt agaatcactg gtattaacag 2820 ctgtcatctt aattaaaata tagtgtctat ttagctgcct atttaagatt aaacacaaga 2880 gtggataact tcccaattta ctgggcctgg tttcaataga gtaaaaatat cagtcataga 2940 ttaattatag tgtcatgaaa gtatgagttg gaaacccttt ccttactttt taccttcatt 3000 tcttagttat tatttttttt tcttcacacc ctgatcaagc cactagtaag cacctatctg 3060 ctgcgagcta ttatatgact ttacagcaaa caacattgct gtgtggcctc tttggggaag 3120 ggaacaggat agcaggaggc tcaggctagc aagtctggac tcaacctaaa gccagaggca 3180 tggttgatag cagagaaagt gaggctcttc acaagtgggt gtgcttaagt aatcagaaac 3240 aggaaggctc tggttgatgg aattatcagt aagatatcta cccttatctc cttcttctat 3300 agaagctaaa ccgtctctcc ttcttgtgtg taggctgata aacacgcttg ttttcttttg 3360 agtgttcatg gctttgcaga ttttcagtgc tctgccagtt cttgttagag ggtttgttac 3420 cttgacacct gggcttggat gttagcatgc caaaggcaca cacttctgaa tgcctgtgta 3480 aaaggttatt attcatttac tttgtctttg gaaaggtgaa gtgtgtgtga gaaagaactc 3540 acaggagatg tattctctgt aggaaaactt ttttttcccc ttaaaagcct ataatccact 3600 ttcagtcaac tttgactttt ataccatgct gtcacatgaa agagtgttta ggcccgctct 3660 cgtggctctg ggaaaagcac caatagggga agaaatgtta tgccgagaaa tctgactggc 3720 agggaaactg ggtcagagct ccccaaagac cactacaggt gttaagtagg aacagtcgag 3780 ggtgggttca tataatagaa tggaacagag ggagggaaga taagctacaa agtttcatag 3840 ggtcctaagt ctttaagata caaaatagct ggttgggctt cataacaaag gaagtctggg 3900 aaggcagcaa gcattgagag ggagatggaa agggaaaaaa caatgtagag gatttgaaaa 3960 gctacaaatc ctccacgaga ggatttttct tggaggaatc tagaacaagg gtggtggatt 4020 aggtggatcg cagaaggact tgctttgcca tttgaatctg ggtttttgtc tctccattga 4080 ggttgagagc gtcacccttt tttaccctgg ataggaggag gaaagaaggg gtgttttgac 4140 tcctacctgg agttttacta gtttacgcaa tggaacagac actcgggacc tcctcttgac 4200 aagaaaaaaa aaaaaaaaag gaaacctgtt gtttctcttg tttgttcttt tgttaagaaa 4260 gcacaggcag ctgggcatgg tggcccatgc ctttaatccc agcatttggg aggcagaggc 
4320 aggtgacttt ctaaattcaa ggccagcctg gtctacaaag tgagttccag gacagccagg 4380 gctatacaga gaaaccctgt ctcgggaaaa aaaaaaaaga agaaaagaaa agaaaagaag 4440 agaagaggag aggagaggag aggagaggag aggagaggag aggagaggag aggagaggag 4500 aggagaggag aagagaagag aagagaagag aagagaagag aagagaagag aagagaagag 4560 aagagaagag aagagaagag aagagaagag aagagaagag aagagaaaag aaaagagaaa 4620 agaaaagaaa aaagcaagca agcaagcact ggcaaagcat gcccacatgg gacgtatgtg 4680 ggtctttgag acaaggcttt tgaattgagc gctcatcaat agttgatcat ggtcaggtgg 4740 agggctacct gtcaggccga gccctgctgg cttagcactt aacatctcca ggtctcagta 4800 tcacttcctg ctgcttagca cagttaggag ttgagcaaac ctttttttcc aacccccact 4860 aaaatttaat ttacaaaagg cagtgtaatt tgtgggatac agtgtgataa ttgatctatg 4920 tgtgcattgt gcaaggttca ataaggtaga tcaataggcc catcaacagc tttatgggtg 4980 tgaaatgcaa gtaatatagg tagatgcctg tgtgtcctta ggtcagaaag gcatgatttt 5040 aaggtcttgg gcaaatcata ttatactcat gttaaaaatg cattatgttg attatcaatc 5100 ttttagagaa ggctgatact tggttttggt gctcagcaag caaatgtcac cagctctttc 5160 taactagtac cactttagaa aatgctaccc gtgctcaaat tggtttgtat tcttattttc 5220 atagcttgga gagagcggag agatcaaagc gatcggggaa ctggacctgc tgtttatgtc 5280 tctgagaaat gcttgcgtct gagcgagaag aagctagaaa acgaagaact gctccttcct 5340 gccttctaaa aagaacaata agatccctga atggactttt ttactaaagg aaagtgagaa 5400 gctaacgtcc accatcatta gaagatttca catgaaacct ggctcagttg aaagagaaaa 5460 tagtgtcaag ttgtccatga gaccagaggt agacttgata accacaaaga ttcattgaca 5520 atattttatt gtcattgata atgcaacaga aaaagtatgt actttaaaaa attgtttgaa 5580 aggaggttac ctctcattcc tctagaagaa aagcctatgt aacttcattt ccataaccaa 5640 tactttatat atgtaagttt atttattata agtatacatt ttatttatgt cagtttatta 5700 atatggattt atttatagaa aaattatctg atgttgatat ttgagtataa agcaaataat 5760 atttatgata ataactatag aaacaagata tcttaggctt taataaacac atgaatatca 5820 taaatcttct gtcttgtaat ttttctccct ttaatatcaa caataccatc atcgtcatca 5880 ttacccaatc attctcatga cttcatgctt gactcatatt atctggtaaa gtttg 5935 
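As an illustrative aside that is not part of the original listing, the sketch below shows one way the flattened rows above could be checked against the short oligonucleotide entries. It strips the running position counters from the first three rows of SEQ ID NO: 9 and locates the 21-mer of SEQ ID NO: 10 within them. The helper names (`clean_listing`, `reverse_complement`, `find_primer`) are hypothetical, and the cleaner assumes it is given sequence rows only, without the header fields such as length, type or organism.

```python
import re

def clean_listing(fragment: str) -> str:
    """Drop position counters, spaces and anything else that is not a base code."""
    return re.sub(r"[^acgtn]", "", fragment.lower())

def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a lowercase DNA string."""
    return seq.translate(str.maketrans("acgtn", "tgcan"))[::-1]

def find_primer(template: str, primer: str):
    """Return (1-based start, strand) of the first exact match, or None."""
    pos = template.find(primer)
    if pos >= 0:
        return pos + 1, "+"
    pos = template.find(reverse_complement(primer))
    if pos >= 0:
        return pos + 1, "-"
    return None

# First three rows of SEQ ID NO: 9, copied from the listing with counters intact.
SEQ9_ROWS = (
    "aacaggctct cctctcagtt atcaactttt gacacttgtg cgatcggtga tggctgtcct 60 "
    "gcagaaatct atgagttttt cccttatggg gactttggcc gccagctgcc tgcttctcat 120 "
    "tgccctgtgg gcccaggagg caaatgcgct gcccatcaac acccggtgca agcttgaggt 180"
)
# SEQ ID NO: 10 as printed in the listing.
SEQ10 = "ctgcctgctt ctcattgccc t"

TEMPLATE = clean_listing(SEQ9_ROWS)
PRIMER = clean_listing(SEQ10)
HIT = find_primer(TEMPLATE, PRIMER)

if __name__ == "__main__":
    print(len(TEMPLATE), HIT)
```

Only exact matches are reported, so a primer with a deliberate mismatch or an inosine position (as in SEQ ID NO: 20) would need a more permissive comparison.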

 1. An isolated nucleic acid molecule which encodes a T cell derived inducible factor, the complementary sequence of which hybridizes, under stringent conditions, to at least one of SEQ ID NO: 7, SEQ ID NO: 8, SEQ ID NO: 9, SEQ ID NO: 29, SEQ ID NO: 24 or SEQ ID NO: 25.
 2. The isolated nucleic acid molecule of claim 1, wherein said isolated nucleic acid molecule encodes a protein having the amino acid sequence of the protein encoded by SEQ ID NO: 7, SEQ ID NO: 8, SEQ ID NO: 9, SEQ ID NO: 29, SEQ ID NO: 24 or SEQ ID NO: 25.
 3. The isolated nucleic acid molecule of claim 1, wherein said molecule is cDNA.
 4. The isolated nucleic acid molecule of claim 1, wherein said molecule is genomic DNA.
 5. The isolated nucleic acid molecule of claim 2, the nucleotide sequence of which consists of the nucleotide sequence SEQ ID NO: 7, SEQ ID NO: 8, SEQ ID NO: 9, SEQ ID NO: 29, SEQ ID NO: 24 or SEQ ID NO: 25.
 6. The isolated nucleic acid molecule of claim 4, having the nucleotide sequence of SEQ ID NO: 25.
 7. An isolated nucleic acid molecule which encodes the protein encoded by the isolated nucleic acid molecule of claim 1.
 8. Expression vector comprising the isolated nucleic acid molecule of claim 1, operably linked to a promoter.
 9. Expression vector comprising the isolated nucleic acid molecule of claim 2, operably linked to a promoter.
 10. Expression vector comprising the isolated nucleic acid molecule of claim 3, operably linked to a promoter.
 11. Expression vector comprising the isolated nucleic acid molecule of claim 4, operably linked to a promoter.
 12. Expression vector comprising the isolated nucleic acid molecule of claim 5, operably linked to a promoter.
 13. Expression vector comprising the isolated nucleic acid molecule of claim 6, operably linked to a promoter.
 14. Recombinant cell comprising the isolated nucleic acid molecule of claim 1.
 15. Recombinant cell comprising the isolated nucleic acid molecule of claim 2.
 16. Recombinant cell comprising the expression vector of claim 8.
 17. Recombinant cell comprising the expression vector of claim 9.
 18. Recombinant cell comprising the expression vector of claim 10.
 19. Recombinant cell comprising the expression vector of claim 11.
 20. Isolated protein encoded by the isolated nucleic acid molecule of claim 1, and having a molecular weight of about 17-30 kilodaltons as determined by SDS-PAGE.
 21. The isolated protein of claim 20, comprising at least 120 amino acids of the protein encoded by SEQ ID NO: 7, SEQ ID NO: 8, SEQ ID NO: 9, SEQ ID NO: 29, SEQ ID NO: 24, or SEQ ID NO: 25.
 22. The isolated protein of claim 21, comprising at least all but the 40 N terminal amino acids encoded by SEQ ID NO: 7, SEQ ID NO: 8, SEQ ID NO: 9, SEQ ID NO: 29, SEQ ID NO: 24, or SEQ ID NO: 25.
 23. The isolated protein of claim 22, comprising at least all but the 20 N terminal amino acids encoded by SEQ ID NO: 7, SEQ ID NO: 8, SEQ ID NO: 9, SEQ ID NO: 29, SEQ ID NO: 24, or SEQ ID NO: 25.
 24. Antibody which specifically binds to the isolated protein of claim 20.
 25. The antibody of claim 24, wherein said antibody is a monoclonal antibody.
 26. A method for determining effectiveness of interleukin-9 on a cell, comprising contacting said cell with an agent specific for at least one of (i) an isolated nucleic acid molecule which encodes a protein whose amino acid sequence is identical to the amino acid sequence encoded by the nucleotide sequence of SEQ ID NO: 7, SEQ ID NO: 8, SEQ ID NO: 9, SEQ ID NO: 29, SEQ ID NO: 24 or SEQ ID NO: 25 and (ii) a protein whose amino acid sequence is identical to the amino acid sequence encoded by the nucleotide sequence of SEQ ID NO: 7, SEQ ID NO: 8, SEQ ID NO: 9, SEQ ID NO: 29, SEQ ID NO: 24 or SEQ ID NO: 25, and determining interaction of said agent with (i) or (ii) as a determination of effectiveness of interleukin-9 on said cell.
 27. The method of claim 26, wherein said agent is an antibody which specifically binds to (ii).
 28. The method of claim 26, wherein said agent comprises the isolated nucleic acid molecule of SEQ ID NO: 7, SEQ ID NO: 8, SEQ ID NO: 9, SEQ ID NO: 29, SEQ ID NO: 24, or SEQ ID NO: 25.
 29. A method for stimulating activation of a STAT protein comprising administering an amount of the protein of claim 20 sufficient to stimulate activation of said STAT protein.
 30. The method of claim 29, wherein said STAT protein is STAT1, STAT3 or STAT5.
 31. A method of inhibiting activation of a STAT protein, comprising administering an amount of an antagonist of the protein of claim 20 sufficient to inhibit stimulation of expression of said STAT protein by said protein.
 32. The method of claim 31, wherein said STAT protein is STAT1, STAT3 or STAT5.
 33. A method for determining presence of TIF in a sample, comprising contacting said sample with an agent which binds to TIF or a nucleic acid molecule encoding TIF, and determining said binding as a determination of TIF in said sample.
 34. The method of claim 33, wherein said agent is an antibody.
 35. The method of claim 33, wherein said agent is a nucleic acid molecule.
 36. A method for screening to determine if a substance influences IL-9 activity, comprising adding said substance to a sample of TIF producing cells, in the presence of IL-9, and determining production of TIF, wherein a difference in production of TIF by said cells as compared to production of TIF by said cells in presence of IL-9 but not said substance indicates said substance influences IL-9 activity.
 37. The method of claim 36, wherein said substance is an IL-9 inhibitor or antagonist, said method further comprising determining lower levels of TIF production by said cells in the presence of said substance as compared to its absence.
 38. The method of claim 36, wherein said substance is an IL-9 activator, said method further comprising determining higher levels of TIF production by said cells in the presence of said substance as compared to its absence.
 39. A method for determining an aberrant level of IL-9 activity in a subject, comprising determining level of TIF in a subject and comparing said level to a normal level, differences therebetween being indicative of an aberrant level of IL-9 in said subject.
 40. The method of claim 39, wherein said aberrant level is excess endogenous IL-9.
 41. The method of claim 39, wherein said aberrant level is insufficient endogenous IL-9.
 42. The method of claim 40, wherein said subject suffers from asthma, an allergy, or lymphoma.
 43. A method for inhibiting IL-9 induced activity in a subject in need thereof, comprising administering an amount of a TIF inhibitor sufficient to inhibit IL-9 induced activity.
 44. The method of claim 43, wherein said TIF inhibitor is an antisense molecule.
 45. The method of claim 43, wherein said inhibitor is an antibody.
 46. A method for treating a subject suffering from asthma or an allergy, comprising administering to said subject an amount of a TIF mutein sufficient to alleviate said asthma or allergy.
 47. A method for determining if a mutein of TIF is therapeutically useful, comprising contacting a cell which produces IL-9 with said mutein, and determining effect of said mutein on production of IL-9, reduction thereof being indicative of possible therapeutic efficacy for said mutein.
 48. A method for determining susceptibility to a condition characterized by aberrant expression of TIF, comprising determining nucleotide sequence of a TIF gene of a subject believed to possess an aberrant TIF gene, presence of an aberrant TIF gene being indicative of possible susceptibility to asthma or allergy.
 49. The method of claim 48, comprising contacting a sample taken from said subject with a pair of oligonucleotide primers which amplify said TIF gene. 